Estimated Time: 2 minutes
FHIR Search Engine - Resource Operators
Technical Consideration
- solr de-normalization gives faster selection queries and greater expressivity
- solr de-normalization gives slower updates and adds complexity
- solr-join can be used in solr-cloud - the joined collection SHALL be replicated over every shard.
- solr-join performance is linear in the size of the joined collection (local parameter: score=none)
- solr-join query time is up to 1 s on a 1M-document collection
- solr-join only implements inner joins, without access to the fields of the joined collection
- spark-solr provides fast access to solr-cloud fields with docValues=true or type string, long, timestamp
- spark-solr produces a dataframe from a solr-cloud collection through the /export streaming-expression handler
- spark-solr supports any query filters to subset the collection
- spark-solr offers a large set of joins, intersections, unions, aggregations and windowing operations
- fhir _query search allows top-down search - Observation from a patient with a given identifier
- fhir _has allows bottom-up search - Patient having an Observation with a given coding
- fhir _graphql does not allow temporal operators (?)
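As a sketch, reading a solr-cloud collection into a dataframe with spark-solr looks like the fragment below; the zkhost and collection names are placeholders and a running SolrCloud is assumed:

```scala
// Sketch of a spark-solr read; zkhost and collection name are placeholders.
// spark-solr streams fields with docValues=true through the /export handler.
val observations = spark.read.format("solr")
  .option("zkhost", "zk1:2181/solr")
  .option("collection", "observation")
  .option("query", "status:final") // any Solr query filter to subset the collection
  .load()
```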
Technical Conclusion
- solr-join is not compatible with large collections
- top-down search can be implemented with solr de-normalization or solr-join
- top-down search prefers solr de-normalization since the joined collections are huge (patient/encounter)
- the de-normalization can cover patient/encounter details - spark can fetch them from the patient/encounter delta files (including _lastUpdated)
- the _has covers the and operator but lacks or, except, after, before, not, count and operator scoping with parentheses
- the _has could be implemented by creating a dedicated collection with all resources in it.
- the _has could be implemented with spark-solr with and operators only (patient left-semi joins)
- adding a long patient field to the fhir patient resource would be useful
- faceting requires unique faceting fields and de-normalized fields (extension in the bundle)
- [Base]/Condition?patient.identifier=xxx&encounter.length=10
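Run against a de-normalized collection, the example query above amounts to turning each chained parameter into a Solr filter on a flattened field. A minimal sketch, assuming a hypothetical `.` -> `_` flattening convention (the function name and field mapping are not part of the design above):

```scala
// Hypothetical translation of chained FHIR parameters into Solr filter
// queries on a de-normalized collection; the '.' -> '_' flattening is an assumption.
def toSolrFilters(params: Map[String, String]): Seq[String] =
  params.map { case (field, value) => s"${field.replace('.', '_')}:$value" }.toSeq.sorted

toSolrFilters(Map("patient.identifier" -> "xxx", "encounter.length" -> "10"))
// -> Seq("encounter_length:10", "patient_identifier:xxx")
```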
Implementation
It is possible to build the solr query with spring-data-solr. This query can then be passed to spark-solr.
fhir-syntax -> solr-syntax
- (Condition?active=false) AFTER(24, recorded-date, created-date) (Composition?status=canceled&_text=patient&type:below=letter)
- ConditionAphp(q=*:*&active=true) AFTER(encounter, 24, recorded-date, created-date) CompositionAphp(q=*:*&status=canceled)
- ConditionAphp(q=*:*&active=true) AND(encounter) CompositionAphp(q=*:*&status=canceled)
- AFTER(AND(Condition(query), Composition(query), encounter), Medication(query), patient, "start-date", "end-date")
- AND(Dataframe, Dataframe, Column)
- OR(Dataframe, Dataframe, Column)
- EXCEPT(Dataframe, Dataframe)
- AFTER(Dataframe, Dataframe, column, column, column)
- OCCURRENCE(Dataframe, number)
- NOT_IN(Dataframe, Dataframe)
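These operators can be prototyped on plain Scala sets before being wired to Spark dataframes; the sketch below shows the intended patient-level set semantics only (in the engine they run as Spark joins on the patient column):

```scala
// Patient-level set semantics of the operators, sketched on Scala Sets.
// In the engine these run as Spark joins on the "patient" column.
def AND(a: Set[Int], b: Set[Int]): Set[Int]    = a intersect b
def OR(a: Set[Int], b: Set[Int]): Set[Int]     = a union b
def EXCEPT(a: Set[Int], b: Set[Int]): Set[Int] = a diff b // left-anti join
def NOT_IN(a: Set[Int], b: Set[Int]): Set[Int] = a diff b // same set semantics as EXCEPT

AND(Set(1, 2, 3), Set(2, 3, 6))    // Set(2, 3)
EXCEPT(Set(1, 2, 3), Set(2, 3, 6)) // Set(1)
```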
import org.springframework.data.solr.core.DefaultQueryParser;
import org.springframework.data.solr.core.mapping.SimpleSolrMappingContext;

DefaultQueryParser b = new DefaultQueryParser(new SimpleSolrMappingContext());
String z = b.createQueryStringFromNode(search.getCriteria(), null);
The flow is as follows:
fhir-client: fhir-syntax operators -> fhir-backend: fhir-syntax operators => solr-syntax operators -> spark-backend: parser, joins, results -> create a patient list within solr
Group versus List
- it is possible to filter resources by Group thanks to reverse chaining
- [Base]/Patient?_has:Group:member:_id=xxxx
- [Base]/Encounter?patient._has:Group:member:_id=xxxx
- [Base]/Observation?patient._has:Group:member:_id=xxxx
- [Base]/Composition?patient._has:Group:member:_id=xxxx
- it is possible to filter resources by List thanks to _list parameter:
- [Base]/Patient?_list=xxxx where xxxx is the id of a List of patients
- [Base]/Encounter?_list=xxxx where xxxx is the id of a List of encounters
Results Information
- the resulting json should contain:
- the overall results
- the results restricted to the practitioner's ward
- the faceting results
Most relevant FHIR resources have two, three or four of these elements:
- id
- patient
- encounter
- date
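These four elements suggest a common row shape for the engine to operate on; the case class below is a hypothetical sketch (name and types are assumptions), with optional fields since presence varies by resource type:

```scala
import java.sql.Timestamp

// Hypothetical common row shape; field presence varies by resource type.
case class ResourceRow(id: String, patient: Long, encounter: Option[Long], date: Option[Timestamp])

val row = ResourceRow("obs-1", 42L, Some(10L), Some(new Timestamp(0L)))
```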
Patient Level
Since every filtered resource is linked to a patient, it is possible to operate on them all:
val r0 = (1::2::3::6::7::Nil).toDF("patient")
val r1 = (1::2::3::Nil).toDF("patient")
val r2 = (2::3::6::Nil).toDF("patient")
val r3 = (1::3::7::Nil).toDF("patient")
- `r1 AND r2 AND NOT r3`: r1.intersect(r2).join(r3,"patient"::Nil,"leftanti").dropDuplicates.show
- `r1 OR r2 OR NOT r3`: r1.union(r2).union(r0.join(r3,"patient"::Nil,"leftanti")).dropDuplicates.show
- `(r1 OR r2) AND r3`: r3.intersect(r1.union(r2)).dropDuplicates.show
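The three expressions above can be replayed on plain Scala sets with the same data, to make the expected patient lists explicit:

```scala
// Same data as the dataframes above, as plain Scala Sets.
val s0 = Set(1, 2, 3, 6, 7)
val s1 = Set(1, 2, 3)
val s2 = Set(2, 3, 6)
val s3 = Set(1, 3, 7)

val andNot = (s1 intersect s2) diff s3      // r1 AND r2 AND NOT r3 -> Set(2)
val orNot  = s1 union s2 union (s0 diff s3) // r1 OR r2 OR NOT r3   -> Set(1, 2, 3, 6)
val orAnd  = s3 intersect (s1 union s2)     // (r1 OR r2) AND r3    -> Set(1, 3)
```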
Encounter Level
val r1 = ((1,10)::(1,11)::(3,12)::Nil).toDF("patient", "encounter")
val r2 = ((1,11)::(3,12)::Nil).toDF("patient", "encounter")
val r3 = (1::3::7::Nil).toDF("patient")
- `r1 SAME_ENC r2`: r1.join(r2, "encounter"::Nil, "leftsemi").select("patient")
- `r1 SAME_ENC NOT r2`: r1.join(r2, "encounter"::Nil, "leftanti").select("patient")
- `r1 SAME_ENC r2 OR_PAT r3`: r1.join(r2, "encounter"::Nil, "leftsemi").select("patient").union(r3)
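The SAME_ENC semantics, replayed on plain collections: keep the rows of r1 whose encounter also appears in r2 (left-semi) or does not (left-anti), then project the patient column:

```scala
// (patient, encounter) rows, same data as the dataframes above.
val e1 = Seq((1, 10), (1, 11), (3, 12))
val e2 = Seq((1, 11), (3, 12))
val enc2 = e2.map(_._2).toSet // encounters present in r2

val same    = e1.filter(r => enc2(r._2)).map(_._1).distinct    // left-semi  -> List(1, 3)
val notSame = e1.filterNot(r => enc2(r._2)).map(_._1).distinct // left-anti  -> List(1)
```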
Date Level
val r1 = ((1,new Date(1L))::(1,new Date(2L))::Nil).toDF("patient", "date")
val r2 = ((1,new Date(1L))::(3,new Date(2L))::Nil).toDF("patient", "date")
- `r1 AFTER r2`: r1.as("r1").join(r2.as("r2"), "patient"::Nil, "inner").filter(col("r1.date") > col("r2.date")).select("patient")
- `r1 AFTER_24h r2`: r1.as("r1").join(r2.as("r2"), "patient"::Nil, "left").filter(col("r1.date").>=(expr("r2.date + INTERVAL 24 HOURS")))
- `r1 NOT AFTER_24h r2`: r1.as("r1").join(r2.as("r2"), "patient"::Nil, "full").filter(col("r1.date").isNull || !col("r1.date").>=(expr("r2.date + INTERVAL 24 HOURS")))
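The AFTER_24h semantics on plain collections: a patient matches when one of its dates in r1 is at least 24 hours after one of its dates in r2 (timestamps as plain milliseconds here, rather than java.sql.Date):

```scala
// (patient, date-in-millis) rows; a patient matches AFTER_24h when it has a
// date in d1 at least 24h after one of its dates in d2.
val dayMs = 24L * 60 * 60 * 1000
val d1 = Seq((1, 0L), (1, 2 * dayMs), (3, dayMs))
val d2 = Seq((1, 0L), (3, dayMs))

val after24h = (for {
  (p1, t1) <- d1
  (p2, t2) <- d2
  if p1 == p2 && t1 >= t2 + dayMs
} yield p1).distinct // List(1)
```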
All Together
Number Of Occurrence
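The OCCURRENCE(Dataframe, number) operator listed earlier reduces, at patient level, to keeping patients with at least n matching rows; in Spark this would be a groupBy("patient").count() followed by a filter. A plain-Scala sketch of that semantics:

```scala
// OCCURRENCE: keep patients that appear at least n times in the result rows.
def occurrence(rows: Seq[Int], n: Int): Set[Int] =
  rows.groupBy(identity).collect { case (p, hits) if hits.size >= n => p }.toSet

occurrence(Seq(1, 1, 2, 3, 3, 3), 2) // Set(1, 3)
```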