I Workshop de Grupos de Investigación Españoles de IA en Biomedicina (IABiomed 2018), SESSION 2


Towards an Early Indicator of Cognitive Impairment Valid as a Screening Tool

Rafael Martínez-Tomás, Mariano Rincón-Zamorano, Alba Gomez-Valadés
Dept. of Artificial Intelligence, Universidad Nacional de Educación a Distancia (UNED), Madrid, Spain

Héctor Gómez, Susana Arias
Facultad de Ciencias Humanas y Educación, Universidad Técnica de Ambato, Ambato, Ecuador

Abstract: Neurodegenerative diseases cause cognitive deficits that severely reduce the quality of life of patients and their families, and they have become one of the most serious problems of today's increasingly aged society. For this class of diseases, early identification, in the first stages, can alleviate and slow their progression. Although clear early indicators already exist for some of these diseases (e.g. Alzheimer's), it is important to have a cheap, fast, minimally intrusive and efficient mechanism for early identification. This article summarizes the principles and results of our research towards this goal. We have worked with neuropsychological tests, both semantic-production and graphic tests, obtaining promising results. We have also experimented with the dissociation between the semantic/emotional orientation of spoken content and the expressiveness that accompanies its articulation, in this case tone of voice. Finally, we describe our work towards a better integration of experimental results in these areas, which come from different sources and in different formats, through the design of an ontological framework that will help increase the reuse of these data for more complete and in-depth analyses.

Keywords: MCI, neuropsychological test, Alzheimer's disease, machine learning, computer vision, Bayesian networks.

I. INTRODUCTION

It is estimated that by 2020, 48.1 million people worldwide will suffer from problems related to the clinical syndrome known as dementia [1], and that this figure will triple by 2050 (footnote 1). Perhaps because of increased life expectancy, Alzheimer's disease (AD), which is already the leading cause of neurodegenerative dementia [2], has been the most worrying and most studied problem related to this syndrome in recent years. It is a degenerative pathology that causes cognitive impairment as well as behavioural and functional problems, with a high impact on the quality of life not only of patients but also of their relatives and caregivers. AD is being tackled from multiple perspectives: on the one hand, by seeking therapies to slow neurodegenerative progression and, on the other, by seeking its early detection. Early diagnosis makes it possible to give the patient greater support, monitor the disease and administer treatments that preserve quality of life for longer. Moreover, as in all neurodegenerative processes, there is a virtuous circle: the earlier the diagnosis, the better the progression can be studied and, consequently, the better the disease can be understood.

1 https://www.ceafa.es/es/que-comunicamos/noticias/la-oms-advierte-que-para-el-2050-se-triplicara-la-cifra-de-personas-que-padecen-demencia

The best and earliest indicator of Alzheimer's disease is the concentration of tau and beta-amyloid proteins in cerebrospinal fluid, but measuring it requires an invasive procedure, lumbar puncture, which is often contraindicated. In general, diagnosing cognitive impairment can be costly in both time and money, since it requires evaluating information from different sources (neuropsychological assessment, laboratory tests, neuroimaging, historical, demographic and personal data, etc.), and its accuracy and efficiency depend on the expertise of the professional staff. Neuropsychological assessment, for its part, has proved to be an essential tool for the early detection of cognitive impairment, since it can identify functional deficits resulting from alterations in the neural network caused by neurodegenerative diseases. The DSM-5, the fifth edition (2013) of the Diagnostic and Statistical Manual of Mental Disorders of the American Psychiatric Association, defines six domains of cognitive function: complex attention, executive function, learning and memory, language, visuospatial function and social cognition, each with a number of subdomains.

Our group has used Artificial Intelligence (AI) techniques and methodologies to analyse these neuropsychological assessments, with the aim of obtaining tools for population screening, that is, tools for the early identification of cognitive impairment that are minimally invasive, inexpensive and free of side effects. Specifically, we have used ontological modelling to structure the available knowledge and use it for inference, Bayesian networks to combine expert knowledge with learning from data, and various machine learning methods when the volume of data allowed it.


XVIII Conferencia de la Asociacion Espanola para la Inteligencia Artificial


To this end, we have formed a multidisciplinary team comprising researchers from the Dept. of Artificial Intelligence and the Dept. of Basic Psychology at UNED, the Centro de Referencia de la Enfermedad de Alzheimer (CREA) in Salamanca and the Dept. of Geriatrics of the Hospital de Zamora. We also collaborate with the group led by Dr. Héctor Gómez at the Universidad de Ambato, the Perpetuo Socorro care home in Quito and other care centres in Ecuador, as well as with several Norwegian university hospitals.

In this article we review our most relevant work along this line, describing the techniques used and the results obtained. Section II describes our work on two types of neuropsychological tests. First, we present our work on the analysis of the oral production of semantic categories using Bayesian networks (BNs), which makes it possible to combine expert knowledge with the available data corpus. Next, we focus on another type of test, also very common in cognitive assessment, based on the freehand copying of schematic geometric figures; here the analysis relies on computer vision and machine learning techniques. Section III describes our work on identifying expressive deficits that may be associated with cognitive impairment, for which we analyse the semantic/emotional orientation of the conversation and the corresponding intonation, looking for inconsistencies in their association by means of different machine learning techniques. Section IV presents another line of work, in which we use ontological modelling to build research support tools in keeping with the current trend in neuroscience research projects (multidisciplinary, heterogeneous and multicentre), where data sharing and accessibility are crucial. Finally, the conclusions section summarizes the achievements, the problems encountered and the limitations of our implementations, and draws general conclusions and future lines of research.

II. SUPPORT FOR THE EARLY IDENTIFICATION OF COGNITIVE IMPAIRMENT FROM NEUROPSYCHOLOGICAL TESTS

Neuropsychological tests (NPTs) have proved very useful for assessing the different domains and subdomains of cognitive function through standardized tasks [4], but whether particular tasks are the best ones is a matter of debate [5], and so is the question of which test or combination of tests, and which evaluation measures, classify typologies and determine stages of impairment most accurately. The debate clearly involves a degree of subjectivity [6] [7] and also intellectual property issues (proprietary tests). There is also inertia towards continuing to use the same tests, in order to ease comparison with other studies and to keep the methodology stable in longitudinal studies. In any case, there is general consensus that NPTs, with very simple requirements (paper and pencil in some cases), are a fundamental tool for uncovering key aspects of the organization of brain activity and its involvement in cognitive disorders. The list of tests could be optimized or shortened to reduce costs in money and time, with a view to its use in screening campaigns across broad sectors of the population.

A. Analysis of oral production with Bayesian networks

In [8,9] we assumed that an early AD diagnosis method based on the analysis of oral production to identify the deterioration of semantic memory met our objectives: it is simple, cheap and accessible to the general population, and it serves as a first filter before more expensive and precise tests. Our further hypothesis was that BNs were the appropriate tool to achieve this objective, since they combine the experts' a priori knowledge (the qualitative model of the BN, its structure) with the knowledge learned from datasets of solved cases (the quantitative model of the BN, its prior probabilities).

In this research we used two sources of information for the quantitative design of the BN. On the one hand, the corpus of oral definitions by Peraita and Grasso [10] provides information on the differences in semantic production between healthy people and patients with cognitive impairment (CI), diagnosed as mild or moderate AD. Peraita and Grasso analysed the semantic features of the definitions in the corpus, classifying the patients' definitions into 11 basic conceptual blocks: taxonomic, types, parts, functional, evaluative, place/habitat, behaviour, cause/generate, procedural, life cycle and other. The test consisted of describing six concepts belonging to two distinct semantic categories, living beings and artefacts; the number of elements produced by each individual for each conceptual block was counted and used to obtain prior probabilities. The authors of the corpus detailed the results for the cases collected (81 people from different hospitals in Madrid, 42 healthy and 39 diagnosed with mild or moderate AD) and analysed how certain semantic categories were mentally represented, using theoretical models of semantic features obtained from explicit linguistic tasks.

On the other hand, the epidemiological study by Fernández et al. [11] relates age, sex and educational level to AD. This allowed us to relate linguistic production to cognitive impairment (CI) and the epidemiological variables to AD and, because CI is an early indicator of AD, to relate CI to AD. By combining these two sources we were able to quantify the influence of the model's variables.
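To make the division of labour between structure and probabilities concrete, the following is a minimal sketch, not the network published in [8,9]. It assumes a naive structure in which AD is the single parent of one discretized count per conceptual block; the block names are taken from the list above, while the prior, the conditional probabilities and the function name `posterior_ad` are invented for illustration.

```python
from math import prod

# Hypothetical prior P(AD = 1); in the paper's setting this kind of number
# would come from epidemiological data, here it is simply invented.
P_AD = 0.30

# Hypothetical P(count for the block is "low" | AD state).
# "high" is the complement. The arrows AD -> block are the expert-given structure.
P_LOW_GIVEN = {
    "taxonomic":  {True: 0.70, False: 0.20},
    "functional": {True: 0.60, False: 0.25},
    "parts":      {True: 0.55, False: 0.30},
}


def posterior_ad(evidence: dict[str, str], prior: float = P_AD) -> float:
    """Return P(AD = 1 | evidence) for evidence such as {"taxonomic": "low"}."""
    def likelihood(ad: bool) -> float:
        # Product of P(observed level | AD state) over the observed blocks.
        return prod(
            P_LOW_GIVEN[block][ad] if level == "low" else 1.0 - P_LOW_GIVEN[block][ad]
            for block, level in evidence.items()
        )

    joint_ad = prior * likelihood(True)
    joint_healthy = (1.0 - prior) * likelihood(False)
    return joint_ad / (joint_ad + joint_healthy)
```

With no evidence the function returns the prior; observing a low taxonomic count raises the posterior because that observation is more likely under AD in the invented table.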

A Java software application was developed to manage the whole process of data acquisition for the BN. Bayesian inference was implemented with the inference modules developed in Elvira (footnote 2). We trained the BN with our own learning model, using as gold standard the diagnosis made by neurologists, who followed the NINCDS-ADRDA criteria for Alzheimer's disease. Given the small number of cases available in the corpus, leave-one-out cross-validation was used to evaluate the model. The resulting ROC curve could be used to estimate the optimal cut-off point according to the trade-off between true positives and false positives. Since the goal was the early diagnosis of AD, ideally the system would detect all AD cases, even if the number of healthy people classified as AD were high. Thus, the optimal threshold on the continuous posterior probability distribution would be the one that maximizes the number of true positives while keeping the number of false positives sufficiently low; however, obtaining that optimal threshold would have required a cost analysis that was beyond the scope of the work. We therefore regard our BN as one more method to support AD diagnosis alongside other complementary tests, and we selected as cut-off the point on the ROC curve with maximum accuracy. Table I shows the performance metrics for this threshold.

2 Now called OpenMarkov: http://www.openmarkov.org
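The evaluation procedure described above (leave-one-out scoring followed by choosing the ROC point with maximum accuracy) can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' Java/Elvira code: `fit` and `score` are hypothetical callables standing in for BN learning and posterior inference, and labels are 1 for AD and 0 for healthy.

```python
def leave_one_out_scores(cases, labels, fit, score):
    """Score each case with a model trained on all the other cases."""
    scores = []
    for i in range(len(cases)):
        train_x = cases[:i] + cases[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)          # hypothetical learning step
        scores.append(score(model, cases[i]))  # hypothetical posterior P(AD)
    return scores


def roc_points(scores, labels):
    """Return (threshold, TP rate, FP rate, accuracy) for each candidate cut-off."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        tn = neg - fp
        points.append((thr, tp / pos, fp / neg, (tp + tn) / len(labels)))
    return points


def max_accuracy_cutoff(scores, labels):
    """Pick the ROC point with the highest accuracy (first one on ties)."""
    return max(roc_points(scores, labels), key=lambda p: p[3])
```

A cost-sensitive choice would replace the accuracy key with an expected-cost function, which is the analysis the text says was left out of scope.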

TABLE I. PERFORMANCE METRICS OBTAINED WITH THE PROPOSED BN

Performance metric                Value
TP rate, TP/(TP + FN)             0.8718
FP rate, FP/(FP + TN)             0.0476
Precision, TP/(TP + FP)           0.9444
Accuracy, (TP + TN)/(P + N)       0.9136
AUC, area under ROC curve         0.9621
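As a worked check of the table, the four rate metrics can be recomputed from a confusion matrix. The counts below are not stated in the text; they are an inference from the reported rates and the cohort of 39 AD and 42 healthy participants, and are the only integer counts of that size that reproduce the published values.

```python
# Inferred (not reported) confusion-matrix counts for 39 AD and 42 healthy cases.
tp, fn = 34, 5   # AD cases detected / missed
fp, tn = 2, 40   # healthy cases flagged / correctly cleared

metrics = {
    "tp_rate": tp / (tp + fn),
    "fp_rate": fp / (fp + tn),
    "precision": tp / (tp + fp),
    "accuracy": (tp + tn) / (tp + fn + fp + tn),
}

# Round to the four decimals used in Table I.
rounded = {name: round(value, 4) for name, value in metrics.items()}
```

The AUC cannot be recovered this way, since it depends on the full ranking of posterior probabilities rather than on a single threshold.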

The study confirmed the advantages of the BN model, and we therefore consider BNs to be the most suitable models for diagnostic systems and for the study of the disease through the analysis of the oral production of semantic features. We concluded that:

- The BN model, both its structure and its parameters, can be understood by the expert.

- The quantitative model can be fitted from a database of cases and studies, thus capturing both the expert's knowledge and the knowledge implicit in the available data.

- The BN model, together with the machine learning algorithm, can be used with another set of semantic categories, as broad as possible or desired (provided, of course, that the corpus contains enough cases).

In addition, this method could be extended with new tools to provide a great deal of knowledge about the irregular deterioration of semantic memory and to contribute to early diagnosis in an extremely inexpensive and accessible way:

- New variables could be added to our BN model, for example associated sociodemographic and clinical variables, where cooperation between multidisciplinary teams could be essential.

- Our system could be extended with an explanation mechanism, for which measuring the model's sensitivity to changes in the evidence would be especially useful.

- Our method can be extended with decision analysis (influence diagrams) to maximize the expected utility of the various options open in the decision. For example, the expected utility of recommending a specific neuropsychological test, or complementary examinations such as biochemical analyses, PET, etc., could be maximized.

- Having more cases would allow a deeper analysis and the selection of the most discriminating variables, reducing the time needed for the tests.
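The decision-analysis extension mentioned in the list above can be illustrated with a small expected-utility calculation. This is only a sketch of the idea behind an influence diagram with one chance node (AD) and one decision node; the action names and utility values are invented and carry no clinical meaning.

```python
# Hypothetical utilities for each action, indexed by the true state (AD or not).
ACTIONS = {
    "refer_for_further_tests": {True: 90, False: -10},
    "no_referral":             {True: -100, False: 0},
}


def expected_utility(p_ad: float, utilities: dict) -> float:
    """Expected utility of one action given the posterior probability of AD."""
    return p_ad * utilities[True] + (1.0 - p_ad) * utilities[False]


def best_action(p_ad: float, actions: dict = ACTIONS) -> str:
    """Return the action that maximizes expected utility for this posterior."""
    return max(actions, key=lambda name: expected_utility(p_ad, actions[name]))
```

With these invented utilities the break-even posterior is 0.05: below it, not referring has the higher expected utility, and above it, referral does. This is how a cost analysis would turn the BN posterior into a recommendation threshold.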

B. Study of graphic neuropsychological tests using computer vision and machine learning

Some of the standardized neuropsychological tests designed to assess mild cognitive impairment (MCI) include the reproduction or copying of geometric figures (graphic neuropsychological tests, G-NPTs). The main idea behind this method is that certain distortions with respect to the patterns may indicate different degrees and profiles of MCI. Moreover, it is possible to assess not only the distortion with respect to the pattern but also how execution evolves over the years. Figures of this kind appear in several tests, such as the Mini Examen Cognoscitivo (MEC) [12], the Test Barcelona [13] and the Rey complex figure [14].

The cognitive processes and functions (cognitive domains) supposedly assessed while these figures are drawn are executive function, visual and/or visuospatial perception, motor skills and spatial memory. For each of these processes or functions, the features or components (control, inhibition, planning, etc.) involved in the patterns under analysis can be defined, for example, following Dr. Peña Casanova's guidelines on drawings of alternating graphics and loops.

In other types of standardized tests, which assess episodic memory, verbal fluency, etc., it is much easier to obtain normative data for scoring without subjectivity. In tests that involve reproducing and copying figures this is much harder, and the evaluator's subjectivity and a degree of discretion prevail. Although evaluation criteria exist, there is also a large subjective component that can undermine reliability if discrepancies between evaluators are not corrected [15]. As a result, there may be a substantial margin of error in detecting certain problems with motor skills, visual or visuospatial perception, etc., in an early MCI detection setting.

In [16] we used computer vision and machine learning techniques to automate this analysis. Although the long-term goal was to work with all the figures from the tests mentioned above, here we focused on the "peak-plateau" ("pico-meseta") figure, one of the subtests of the Test Barcelona. The diagnostic value of this test, selected for this exploratory automatic analysis, rests on the fact that its execution involves several components of executive function, such as serialization, planning, flexibility and inhibition, as well as praxic ability. The task consists of copying a figure in which peaks and plateaus alternate. Distortions that may emerge in the execution of this figure are of different types: variations in the size of the drawn figure, alteration of features by addition or omission, scribbling, perseverations, rotations, closing-in on the pattern, etc. These errors or alterations in reproducing the drawing may be markers of more severe dysfunctions, possibly apraxic in nature, in which executive functions are also involved, as in Alzheimer-type dementia [17].

Based on the results obtained in several NPTs, the participants were classified into two cognitive profiles: healthy individuals (n=16), those with normal scores on the reference scales, and individuals with MCI (n=24). A computer vision algorithm was developed to 1) segment the drawing, 2) extract the lines of the drawing, 3) recognize the components of the pattern (peaks and plateaus), 4) characterize the drawing and the drawing-pattern discrepancies (15 features related to position relative to the pattern, size difference, tilt of the drawing, failures in the peak-plateau pattern, etc.) and 5) finally, produce a diagnosis using a classifier implemented as a J48 decision tree. The classifier achieved a precision of 0.77 and a recall of 0.77.
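Steps 3 and 4 of the pipeline above can be illustrated with a toy sketch. It is not the algorithm of [16]: it assumes the drawing has already been reduced to an ordered polyline of (x, y) points with y growing upward, and the function names, the slope tolerance and the three features are hypothetical stand-ins for the 15 features mentioned in the text.

```python
def classify_segments(points, flat_tol=0.2):
    """Label each polyline segment 'up', 'down' or 'flat' from its slope."""
    labels = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx != 0 and abs(dy / dx) <= flat_tol:
            labels.append("flat")
        else:
            labels.append("up" if dy > 0 else "down")
    return labels


def count_components(labels):
    """Count peaks (up, down) and plateaus (up, flat, down) in the sequence."""
    # Collapse consecutive repeats so that long strokes count once.
    runs = [lab for i, lab in enumerate(labels) if i == 0 or lab != labels[i - 1]]
    peaks = plateaus = 0
    i = 0
    while i < len(runs):
        if runs[i:i + 3] == ["up", "flat", "down"]:
            plateaus += 1
            i += 3
        elif runs[i:i + 2] == ["up", "down"]:
            peaks += 1
            i += 2
        else:
            i += 1
    return peaks, plateaus


def discrepancy_features(drawing, pattern):
    """Compare a drawing with the pattern on component counts and width."""
    d_peaks, d_plateaus = count_components(classify_segments(drawing))
    p_peaks, p_plateaus = count_components(classify_segments(pattern))

    def width(pts):
        return max(x for x, _ in pts) - min(x for x, _ in pts)

    return {
        "missing_peaks": p_peaks - d_peaks,
        "missing_plateaus": p_plateaus - d_plateaus,
        "width_ratio": width(drawing) / width(pattern),
    }


# Hypothetical pattern: one peak followed by one plateau.
PATTERN = [(0, 0), (1, 2), (2, 0), (3, 0), (4, 2), (6, 2), (7, 0)]
```

A feature dictionary of this kind is what would then be passed to a decision tree; the classifier itself is omitted here because J48 is a Weka (Java) implementation and no training data is available in the text.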

The fundamental problem of diagnosing AD with machine learning, as in most neurological studies, is the lack of a dataset large enough to build a reliable diagnostic system. We must acknowledge that, with this small sample, we could only conclude that the features proposed for MCI diagnosis were good candidates, but that on their own they did not allow different types of MCI to be distinguished. As future work, we are enlarging the sample and want to combine features from different figures to improve classification performance. This work is thus intended as a pilot study for a much broader one in which, using the longitudinal data already available, we will be able to carry out different types of analysis:

– Longitudinal analysis of healthy controls: comparing, over a number of years, how healthy control subjects execute the drawings, to verify the stability of their performance on the figures.

– Longitudinal analysis of stable MCI subjects: the same kind of approach, but in another group of subjects, to try to answer the following questions: Is the execution of the figures stable? Of all of them? For how long?

– Longitudinal analysis of MCI subjects who progress either to AD or to another dementia.

– Cross-sectional analysis of different types of MCI: group studies with amnestic, multidomain and non-amnestic MCI.

– Relationship between figure execution and ideomotor apraxia: studying the relationship between the execution of the figures and the ideomotor apraxia scores obtained by the same subjects in other tests.

– Sociodemographic analysis: analysing the relationships between sociodemographic variables, such as age, gender or educational level, and the execution of the figure patterns mentioned above.

On the other hand, G-NPTs suffer from the ambiguity inherent in assessment by different specialists, both in which features of the drawings are significant and in how the patient is classified according to the result. To address this problem it seems appropriate to apply AI techniques that handle ambiguity in a more natural way, consistent with human interpretation, and avoid "crisp" decisions (in the sense of classical set theory), for example fuzzy or probabilistic techniques.
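The contrast between a crisp and a fuzzy judgement can be shown with a single feature. The sketch below is illustrative only: the feature (relative size deviation of the drawing), the crisp threshold and the trapezoid parameters are all invented, and the text does not say which fuzzy model, if any, the authors would adopt.

```python
def crisp_is_large(deviation: float, threshold: float = 0.25) -> bool:
    """Classical set membership: the deviation is either large or it is not."""
    return deviation >= threshold


def trapezoid(x: float, a: float, b: float, c: float, d: float) -> float:
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear in between."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def fuzzy_large(deviation: float) -> float:
    """Degree (0 to 1) to which a size deviation counts as 'large' (invented shape)."""
    return trapezoid(deviation, 0.1, 0.4, 1.0, 1.5)
```

Two drawings with deviations of 0.24 and 0.26 fall on opposite sides of the crisp threshold, yet receive nearly identical fuzzy degrees, which is closer to how two human raters would treat them.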

For certain G-NPTs, ML techniques have shown high levels of accuracy, but each test has been handled independently because of the uncertainty in the drawing-pattern association. There are many types of G-NPT, and they need to be associated with cognitive and motor abilities, and these in turn with brain regions and dynamics. Ontologies that model, organize and link them are still lacking.

III. ANALYSIS OF EXPRESSIVENESS DEFICIT USING SENTIMENT ANALYSIS

In [18] we hypothesized that the semantic/emotional content of speech and its associated physical manifestations (gestures, intonation, etc.) are aligned in cognitively normal people but not in people affected by dementia. We therefore proposed this feature as a way to identify patients with cognitive problems at an early stage.

The study of behavioural changes, which has been analysed with favourable results from text written by elderly people [19], could help prevent diseases linked to such changes. Voice analysis has been used to indicate signs of disease in the elderly [20]. We consider the semantics of a conversation and the emotion in the tone of voice jointly, rather than separately, to check whether the semantic-emotional content of an interviewee's text correlates with the tone of voice. If the two are unrelated, we could assume that relatives or caregivers should be alerted to changes in the patient's behaviour.

For the analysis, we selected two features associated with the semantics of a text: polarity and semantic orientation. Polarity (positive, negative or neutral) is based on specialized dictionaries, and we used the SentiStrength tool (footnote 3) to measure it. Semantic orientation is a measure proposed in [26] that helps classify phrases as "Excellent" or "Poor". It works with the opinions that people hold about words or phrases in a predetermined context. This orientation is looked up for similar phrases through the Google API (footnote 4).

The variations in intonation that a person gives a phrase while speaking are associated with emotional variations. To measure them we used the Speech Recognition System tool (footnote 5). The emotion of the voice can then be compared with polarity and with semantic orientation, since all of them are classified into positive ("excellent") and negative ("poor") bands. We were interested mainly in the analysis and interpretation of the data, rather than in research on feature extraction techniques.

3 http://sentistrength.wlv.ac.uk/
4 http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=%s
5 https://www.linkedin.com/pulse/speech-emotion-recognition-system-matlab-source-code-luigi-rosa
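The comparison of bands described above can be sketched in a few lines. The sketch assumes that the semantic orientation of [26] is the PMI-based measure built on search hit counts near "excellent" and "poor" (the PMI label in Table II suggests this, but the text does not spell out the formula). Hit counts are passed in as plain numbers, no search API is called, and the function names and the sign-based dissociation rule are hypothetical.

```python
from math import log2


def semantic_orientation(hits_near_excellent, hits_near_poor,
                         hits_excellent, hits_poor, smoothing=0.01):
    """PMI-style orientation: positive leans 'excellent', negative leans 'poor'.

    The smoothing term avoids division by zero when a phrase never co-occurs
    with one of the reference words.
    """
    return log2(((hits_near_excellent + smoothing) * hits_poor) /
                ((hits_near_poor + smoothing) * hits_excellent))


def band(value, neutral_margin=0.0):
    """Map a signed score to a 'positive', 'negative' or 'neutral' band."""
    if value > neutral_margin:
        return "positive"
    if value < -neutral_margin:
        return "negative"
    return "neutral"


def is_dissociated(text_score, tone_score):
    """Flag an utterance whose text band and tone band point in opposite directions."""
    text_band, tone_band = band(text_score), band(tone_score)
    return "neutral" not in (text_band, tone_band) and text_band != tone_band
```

For example, a phrase found four times more often near "excellent" than near "poor" (with equal base counts) gets an orientation of 2.0; if the tone score for the same utterance is negative, the pair is flagged as dissociated.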

The interviews (text and audio) for the control group, made up of healthy elderly people, were obtained from Charlotte (footnote 6). The results showed no great variability between the values labelled by experts and those produced by the tools. No dissociation between polarity, semantic orientation and intonation was found. The method was then tested with patients in an early stage of AD, and a relevant dissociation was found (p-value < 0.05). The polarity, semantic orientation and intonation values obtained were used to build several classifiers to distinguish healthy people from people with AD. The classifiers were tested with individual and combined variables, and the best F1 scores were obtained when working with all three variables simultaneously (Table II). The results show that the method can be used for diagnosis, and that the variables work better together than when used separately in the classifiers.

TABLE II. COMBINING SENTISTRENGTH-PMI-TONALITY

Algorithm                Precision   Recall   F1 score
J48                      0.88        0.78     0.83
Multilayer perceptron    0.75        0.86     0.80
Bayes Net                0.79        0.91     0.85
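As a quick arithmetic check, the F1 column of the table can be recomputed from the precision and recall columns using the harmonic mean. The dictionary below only transcribes the table; the helper name `f1_score` is ours.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) transcribed from Table II.
TABLE_II = {
    "J48": (0.88, 0.78, 0.83),
    "Multilayer perceptron": (0.75, 0.86, 0.80),
    "Bayes Net": (0.79, 0.91, 0.85),
}

# True where the recomputed F1 matches the reported value at two decimals.
consistent = {
    name: round(f1_score(p, r), 2) == reported
    for name, (p, r, reported) in TABLE_II.items()
}
```

All three rows are consistent at two decimals, and the Bayes Net row has the highest F1 because its recall gain outweighs its lower precision relative to J48.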

IV. AN INCREMENTAL SEMANTIC FRAMEWORK TO IMPROVE ACCESSIBILITY IN BIOBANKING SYSTEMS

As has become clear throughout the article, one of the problems we face is the lack of a sufficiently large sample for the studies. It is therefore essential to be able to integrate datasets from different studies with very different characteristics and contexts. Data sharing and accessibility make it possible to pool the efforts of different research groups and support replication studies, which are crucial for progress in the field. Research data archive solutions are evolving rapidly to address these needs; however, integrating distributed data is still difficult because the different data models must be reconciled. To address these problems, ontologies are widely used in biomedical research to obtain common vocabularies and logical descriptions, but their application can suffer from scalability problems, domain bias and loss of access to low-level data. To improve the application of semantic models in biobanking systems, we have designed and implemented an incremental semantic framework that builds on recent advances in biomedical ontologies and on the XNAT platform [21]. It is a methodological approach to the problem of enabling semantics-based models in research archive systems that are already deployed. This improves data management, from low-level data to semantic and logical concepts. Built with Semantic Web technologies and biomedical ontologies, the framework provides a homogeneous model for data access and reasoning over multimodal neurological data.

6 http://newsouthvoices.uncc.edu/nsv/southernpiedmont

The framework's design follows a layered bottom-up approach, which makes it possible to work with the data at different levels of description. The framework adds reasoning capabilities based on implicit relations and logical definitions to derive new data, and performs data consistency checks for quality control. Applying linked data principles makes it possible to link data items, opening the door to external reference datasets. In addition, having a highly linked dataset makes it easier to inspect the data from different conceptualizations (project, subject, disease, etc.), a highly desirable feature for pattern discovery and for studying the relationships between diseases as the dataset grows.
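The two capabilities named above, deriving new data from implicit relations and checking consistency for quality control, can be illustrated with a toy triple store. This is a sketch in plain Python sets, not the framework's Semantic Web implementation and not XNAT's data model: every class, property and instance name below is hypothetical.

```python
# Hypothetical low-level records lifted to (subject, predicate, object) triples.
TRIPLES = {
    ("MRISession", "subClassOf", "ImagingSession"),
    ("ImagingSession", "subClassOf", "Experiment"),
    ("scan_001", "type", "MRISession"),
    ("scan_001", "hasSubject", "subject_17"),
}


def infer_types(triples):
    """Propagate 'type' along 'subClassOf' until no new triple appears."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            if p != "type":
                continue
            for child, q, parent in list(inferred):
                new = (s, "type", parent)
                if q == "subClassOf" and child == o and new not in inferred:
                    inferred.add(new)
                    changed = True
    return inferred


def experiments_without_subject(triples):
    """Quality-control rule: every Experiment instance must have a subject."""
    closed = infer_types(triples)
    experiments = {s for s, p, o in closed if p == "type" and o == "Experiment"}
    with_subject = {s for s, p, o in closed if p == "hasSubject"}
    return experiments - with_subject
```

Here `scan_001` is never declared an `Experiment`, yet the check applies to it because the type is derived through two subclass links; a query at the `Experiment` level therefore reaches the low-level record without losing it.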

Our proposal differs from previous work in that it focuses on advanced querying and reasoning without losing the low-level data, while taking advantage of archive platforms that are already available and widely used. In particular, we chose XNAT as the backbone for managing clinical data and images because of its broad feature set and its flexible, customizable design.

This framework is being used in the JPND (EU Joint Program for Neurodegenerative Disease, footnote 7) / APGeM project (footnote 8), which aims to find early biomarkers for Alzheimer's disease and other related dementias [22]. The project comprises a significant amount of data from different subdomains and modalities, such as neuroimaging, biochemistry, clinical/neuropsychological screening and genetics, providing a suitable scenario for driving and testing the framework in the course of current neurological research.

V. CONCLUSIONS

The problem of identifying the early stages of neurodegenerative diseases can be tackled from multiple perspectives. Although we have experimented with the deficit in gestural production in people with dementia, this factor, measured in our work through the dissociation between gesture, tone of voice and the semantic-emotional content of speech, does not serve as an early neuropsychological indicator of the disease, because subjective characteristics such as motivation, depression, stress or simply mood can cause healthy people to show similar signs of dissociation.

For that reason, we are focusing more on identifying cognitive impairment through neuropsychological tests, which have proven to be a fast and inexpensive solution that is worth investing resources in. We have investigated the automation and automatic evaluation of different mechanisms: the oral production test and graphic tests. There are other tests, such as the verbal fluency test, that yield different conclusions and will have to be analyzed in detail and, most likely, automated in future work.

7 http://www.neurodegenerationresearch.eu/

8 http://www.neurodegenerationresearch.eu/publication/apgem/

In any case, the amount of stored clinical-record data grows nonlinearly every day, so it is natural for new research to try to bring all this information together. The way research is done is shifting from an individualistic approach, in which a model must be proposed that explains or is consistent with the few data available, towards a cooperative one, in which multicentre, multidisciplinary, multidimensional, and heterogeneous studies will be the norm, and knowledge modelling and machine learning from data will become increasingly important. To that end, as the last work presented here shows, an ontological framework is essential to guide the integration of different sources, data formats, and results, so that they can be better exploited in more complete and deeper experiments that help us better understand the disease and how it affects cognitive functions and brain areas.

ACKNOWLEDGMENTS

- Project 018-ABEL-CM-2013, funded by the NILS Science and Sustainability programme, with The Intervention Centre of Oslo University Hospital.

- Predoctoral grant PEJD-2017-PRE/TIC-4406, co-funded by UNED and the European Social Fund through the Youth Employment Operational Programme and the Youth Employment Initiative.

REFERENCES

[1] M. Prince, R. Bryce, E. Albanese, A. Wimo, W. Ribeiro and C.P. Ferri. “The global prevalence of dementia: a systematic review and metaanalysis”. Alzheimer's & Dementia, vol. 9(1), pp. 63-75, 2013.

[2] H.W. Querfurth and F.M. LaFerla. “Alzheimer's disease”. The New England Journal of Medicine, vol. 362, no. 4, pp. 329–344, 2010.

[3] S. G. Gauthier. “Alzheimer's disease: the benefits of early treatment”. European Journal of Neurology, vol. 12(s3), pp.11-16, 2005.

[4] M.D. Lezak. Neuropsychological assessment. Oxford University Press, USA, 2004.

[5] G. Gainotti, D. Quaranta, M.G. Vita and C. Marra. “Neuropsychological predictors of conversion from mild cognitive impairment to Alzheimer's disease”. Journal of Alzheimer's Disease, vol. 38(3), pp. 481-495, 2014.

[6] H.C. Fichman, R.M. Oliveria and C.S. Fernandez. “Neuropsychological and neurobiological markers of the preclinical stage of Alzheimer's disease”. Psychology & Neuroscience, vol. 4(2), pp. 245-253, 2011.

[7] R.H. Logie, M.A. Parra and S. Della Sala. “From cognitive science to dementia assessment”. Policy Insights from the Behavioral and Brain Sciences, vol. 2(1), pp. 1-91, 2015.

[8] J.M. Guerrero, R. Martinez-Tomas, M. Rincon and H. Peraita. “Diagnosis of Cognitive Impairment Compatible with Early Diagnosis of Alzheimer’s Disease”. Methods of information in medicine, vol. 55(1), pp. 42-49, 2016.

[9] J.M. Guerrero, R. Martínez-Tomas and H. Peraita. “Bayesian Network Based Model for the Diagnosis of Deterioration of Semantic Content Compatible with Alzheimer’s Disease”. Foundations on Natural and Artificial Computation. Lecture Notes in Computer Science, vol. 6686, pp. 461-470, Springer, 2011.

[10] H. Peraita and L. Grasso. Corpus lingüístico de definiciones de categorías semánticas de personas mayores sanas y con la enfermedad de Alzheimer: una investigación transcultural hispano-argentina. Fundación BBVA, 2010.

[11] M. Fernández-Martínez, J. Castro-Flores, S. Pérez-delas-Heras, A. Mandaluniz-Lekumberri, M. Gordejuela and J. Zarranz. “Prevalencia de la demencia en mayores de 65 años en una comarca del País Vasco”. Revista de Neurología, vol. 46 (2), pp. 89–96, 2008.

[12] A. Lobo, J. Ezquerra, F. Gómez, J.M. Sala and A. Seva. “El mini-examen cognoscitivo. Un test sencillo, práctico, para detectar alteraciones intelectivas en pacientes médicos”. Actas Luso Españolas de Neurología, Psiquiatría y Ciencias Afines, vol. 3, pp. 189–202, 1979.

[13] J. Peña-Casanova. Programa integrado de exploración neuropsicológica “test Barcelona”. In Normalidad, semiología y patología neuropsicológica. Masso, Barcelona, 1991.

[14] A. Rey. Rey. Test de copia y de reproducción de memoria de figuras geométricas complejas. TEA, Madrid, 2003.

[15] S. Urbina. Claves para la evaluación con tests psicológicos (Kaufman, A.S., Kaufman, N.L (trad.)). TEA Ediciones (publicado originalmente en 2004), Madrid, 2007.

[16] M. Rincón, S. García-Herraz, M.C. Díaz-Mardomingo, R. Martínez-Tomás and H. Peraita. “Automatic drawing analysis of figures included in neuropsychological tests for the assessment and diagnosis of mild cognitive impairment”. In International Work-Conference on the Interplay Between Natural and Artificial Computation, pp. 508-515. Springer, Cham, 2015.

[17] R.Q. Freeman, T. Giovannetti, M. Lamar, B.S. Cloud, R.A. Stern, E. Kaplan and D.J. Libon. “Visuoconstructional problems in dementia: contribution of executive systems functions”. Neuropsychology, vol. 14(3), pp. 414–426, 2000.

[18] S. Arias, R. Martinez-Tomas, H. Gómez, V. Hernandez, J. Sanchez, J. Barbosa and J. Mocha. “The dissociation between polarity, semantic orientation, and emotional tone as an early indicator of cognitive impairment”. Frontiers in Computational Neuroscience, vol. 10 (95), 2016.

[19] H. Cole-Lewis and T. Kershaw. “Text Messaging as a Tool for Behaviour Change in Disease Prevention and Management”. Epidemiology, nº 10.1093, 2010.

[20] L. Naranjo, C. Pérez, Y. Campos-Roca and J. Martín. “Addressing Voice recording replications for Parkinson’s disease detection”. Expert Systems with Applications, vol. 46, pp. 286-292, 2015.

[21] S. Timon, M. Rincón and R. Martínez-Tomás. “Extending XNAT Platform with an Incremental Semantic Framework”. Frontiers in Neuroinformatics, vol. 11, art. 5, 2017.

[22] T. Fladby, L. Pålhaugen, P. Selnes, K. Waterloo, G. Bråthen, E. Hessen. “Detecting at-risk Alzheimer’s disease cases”. Journal of Alzheimer’s Disease, vol. 60(1), pp. 97-105, 2017.


eXiT Research Group at the University of Girona: Artificial Intelligence and Machine Learning Applied to Medicine and Healthcare

Beatriz Lopez, Natalia Mordvanyuck, Joaquim Massana, Ferran Torrent-Fontbona, Gerard Caceres, Carles Pous

eXiT research group

University of Girona

{beatriz.lopez,natalia.mordvanyuk}@udg.edu,[email protected],

[email protected],[email protected],[email protected]

Abstract—This paper describes the research activity of the eXiT research group at the University of Girona and the main results achieved. The research is organized in four main groups: prognosis, clinical decision support systems, mHealth, and biosignal processing. The paper describes past, current and future research of the group.

Index Terms—Case-based reasoning, Personalization, Patient monitoring, Resource optimization.

I. INTRODUCTION

The eXiT research group (Control Engineering and Intelligent Systems) focuses on decision support systems, case-based reasoning, machine learning and optimization. The group has two main labs. One of them, the Medicine and Healthcare lab, focuses on incorporating intelligent systems and biomedical engineering into health management and on developing innovative medical technologies.

The research activity of the eXiT Medicine and Healthcare lab (eXiT for short from now on) started around 2005, when the Foundation Hospital Dr. Josep Trueta became a research institution and synergies emerged between the hospital’s researchers and the University staff.

As a result of eXiT’s research activity, several awards have been received: Best Paper in the 2014 IMIA Yearbook of Medical Informatics for the work ‘Enabling the use of hereditary information from pedigree tools in medical knowledge-based systems’, the ITEA Excellence Award 2015 for the MEDIATE project, 2nd prize in the Vall d’Hebron Research Institute (VHIR) Innovation Healthcare Contest for the NoaH project, and the ITEA Vice-chairman’s award for SME Success in the MOSHCA project. Moreover, eXiT has been recognized by the regional government with quality labels since 1995 (current label: 2017 SGR 1551). Another important outcome is the eXiT*CBR tool (http://exitcbr.udg.edu/), which is available to all researchers.

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 689810 (PEPPER). Work developed with the support of the distinction awarded by the Generalitat de Catalunya 2017 SGR 1551, the MESC project funded by the Spanish MINECO (Ref. DPI2013-47450-C21-R), and the University of Girona grant supporting research groups to improve their scientific productivity (MPCUdG2016).

An overview of the application domains in which eXiT has been working is shown in Figure 1. In the next sections, the applications are described and grouped according to the problems addressed: prognosis (purple in Figure 1), Clinical Decision Support Systems (CDSS, green), mHealth (blue), biomedical signal processing (orange) and healthcare services (garnet). Some future applications (black) are discussed at the end of the paper.

Fig. 1. Overview of application domains. Coloured applications described in this paper: Purple: Prognosis; Green: Clinical Decision Support Systems; Blue: mHealth; Orange: biomedical signal processing; Garnet: Healthcare services; Black: future research.

II. PROGNOSIS

Artificial Intelligence (AI) and Machine Learning (ML) methods have been applied to prognosis for two risk diseases: breast cancer and Type 2 Diabetes Mellitus (T2DM).


A. Breast Cancer

We faced the following prognosis problem: generating a breast cancer risk model from a dataset of 871 cases (628 healthy women and 243 women with breast cancer). The core technology used was case-based reasoning (CBR, [1], [10]).

Instead of developing a dedicated system, we opted to build a generic tool, eXiT*CBR [15], [27], which provides some key features for Medicine, such as support for repeating experiments and for interacting with clinicians by showing the results in plots that are easy for them to interpret.

The main techniques include the particular methods that implement the basic steps of the CBR methodology (i.e. similarity metrics, case selection), as well as evaluation techniques (e.g. cross-validation), metrics (e.g. precision, recall), and plots (e.g. ROC plots). eXiT*CBR also includes some plug-ins and pre-processing steps. One of them, related to the acquisition of inheritance data from families, received an award from the International Medical Informatics Association (IMIA Best Paper in 2014, [7]).
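As an illustration of the CBR steps and metrics named above, the following is a minimal sketch and not the eXiT*CBR implementation. The similarity function, the value of k, the majority-vote reuse and the toy cases are all assumptions made for the example.

```python
from math import sqrt


def similarity(a, b):
    """Map the Euclidean distance between two feature vectors to (0, 1]."""
    distance = sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + distance)


def retrieve(case_base, query, k=3):
    """Retrieve step: the k stored cases most similar to the query."""
    ranked = sorted(case_base, key=lambda case: similarity(case[0], query), reverse=True)
    return ranked[:k]


def reuse(neighbours):
    """Reuse step: majority vote over the labels of the retrieved cases."""
    positives = sum(label for _, label in neighbours)
    return 1 if 2 * positives > len(neighbours) else 0


def precision_recall(predicted, actual):
    """Precision and recall for the positive class (label 1)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    fn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy case base: (features, label), with label 1 = affected, 0 = healthy.
case_base = [
    ((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.7, 0.7), 1),
    ((0.1, 0.2), 0), ((0.2, 0.1), 0), ((0.3, 0.3), 0),
]
queries = [(0.85, 0.85), (0.15, 0.15), (0.75, 0.8)]
actual = [1, 0, 1]
predicted = [reuse(retrieve(case_base, q)) for q in queries]
precision, recall = precision_recall(predicted, actual)
```

In a real evaluation the queries would come from held-out folds of a cross-validation rather than from a hand-written list.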

In a second stage, the need to improve the results of the first prototype led us to include additional features in the second version of eXiT*CBR: multi-agent management (enabling physicians to make a joint prognosis) [23], multi-criteria decision making (as a consequence of the former) [14], feature learning with genetic algorithms [16], and subgroup discovery [11].

The medical team in this joint research comprises members of the Catalan Institute of Oncology (Joan Brunet, Judith Sanz, Teresa Ramon y Cajal). Funding was provided by the Girona Biomedical Research Institute (“Family history as a model for evaluation breast cancer risk”) and the University of Girona (eXiTDM: Data Mining Platform).

B. Type 2 Diabetes Mellitus

In this case, we dealt with the problem of building a model for T2DM prognosis. The starting point was a set of 1074 subjects with and without T2DM, described by both clinical data (body mass index, sex, age) and up to 112 polymorphisms (insulin-like growth factor binding protein, CD36 antigen, adiponectin receptor 2, etc.).

The methods applied were dimensionality reduction with PCA, specifically regular simplex PCA, which is suitable for discrete data, and personalized prediction with CBR. The success rate achieved was around 89% with 5 principal components [26].

The research was carried out with the endocrinologist Dr. J.M. Fernandez-Real of the Girona Biomedical Research Institute (IdiBGi). This work was made possible by the support of the Spanish MEC project DPI2005-08922-C02-02, the IdiBGi project GRCT41, and DURSI AGAUR SGR 00296 (AEDS).

III. CLINICAL DECISION SUPPORT SYSTEMS

One of the first CDSS works developed by eXiT addressed acute stroke classification. Acute strokes are medical emergencies that require an expert neurologist to detect the illness within the appropriate therapeutic time window. Thanks to the development of new treatments, such as rt-PA, mortality rates have been falling in recent decades. However, the final diagnosis of patients is often imprecise, and the administration of new treatments depends on it. That is, a patient can be diagnosed with acute stroke while its clinical category often remains unknown. This situation has been detected by the Spanish Association of Neurologists, which has set up a repository of cases named Badisen. Using this database, we designed a multi-agent case-based system to support acute stroke diagnosis. Each agent in the system keeps the experience of a single hospital, preserving the particular decision criteria employed by its main physician. When confidence in the initial diagnosis is low, agents collaborate by means of a trust mechanism [17]. The medical team includes Dr. Serena from the Hospital Dr. Trueta of Girona.
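The collaboration scheme can be pictured with the following minimal sketch. It is not the mechanism of [17]: the confidence threshold, the trust values, the similarity function and the diagnostic labels are hypothetical and chosen only to show one agent consulting its peers when its own confidence is low.

```python
from collections import defaultdict


def local_diagnosis(case_base, query):
    """Return (label, confidence) from the most similar case held by one hospital agent."""
    def sim(a, b):
        return 1.0 / (1.0 + sum(abs(x - y) for x, y in zip(a, b)))

    features, label = max(case_base, key=lambda case: sim(case[0], query))
    return label, sim(features, query)


def collaborative_diagnosis(own_cases, peers, trust, query, threshold=0.8):
    """Answer alone when confident; otherwise weight peer answers by trust."""
    label, confidence = local_diagnosis(own_cases, query)
    if confidence >= threshold:
        return label  # confident enough, no collaboration needed
    votes = defaultdict(float)
    votes[label] += confidence  # the agent's own opinion
    for name, peer_cases in peers.items():
        peer_label, peer_confidence = local_diagnosis(peer_cases, query)
        votes[peer_label] += trust[name] * peer_confidence
    return max(votes, key=votes.get)


# Hypothetical agents, one per hospital, with toy two-feature cases.
own_cases = [((1.0, 0.0), "category A")]
peers = {
    "hospital B": [((0.0, 1.0), "category B")],
    "hospital C": [((0.1, 0.9), "category B")],
}
trust = {"hospital B": 0.9, "hospital C": 0.5}

confident = collaborative_diagnosis(own_cases, peers, trust, (1.0, 0.0))
assisted = collaborative_diagnosis(own_cases, peers, trust, (0.0, 1.0))
```

For the second query the local agent has only a distant case, so the trusted peers decide the outcome.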

The second problem addressed was a workflow-based CDSS designed to give case-specific assessment to clinicians during complex or minimally invasive surgery. Following a perioperative workflow, the software uses the CBR methodology to retrieve similar past cases from a case base and provide support at any particular point of the process. The graphical user interface allows easy navigation through the whole support process, from the initial configuration steps to the final results, which are organized as sets of experiments that are easy to visualize. As a result, the eXiTCDSS tool was developed [5]. This work was performed in the context of the MEDIATE European Project (Eureka ITEA 2 no 09039 -TSI-020400-2010-84) with several clinical teams involved and the key collaboration of Hospital Clinic de Barcelona.

Finally, we are currently finalizing HTE 3.0, a CDSS for lipid-lowering treatment and familial hypercholesterolemia detection [2]. In this case, the main AI technique is rule-based reasoning combined with a three-layer decision chain: risk analysis, treatment personalization, and safety and cost-effectiveness criteria. The work has been carried out with Dr. A. Zamora of the Corporacio de Salut del Maresme i La Selva. HTE 3.0 has been registered at the Intellectual Property Register of Girona, ref UdG - Reg GI2532016 - Web CDSS HTE 3.0., and it has been sponsored by Sanofi.

IV. MHEALTH: MONITORING AND PERSONAL RECOMMENDATIONS

One of the key challenges in modern Medicine is to provide individualized assessments to patients [3]. In this regard, precision Medicine looks for genetic particularities in order to provide the appropriate treatment to each patient. However, other ICT approaches that give personalized responses to users, taking into account their particular history and context, also address this challenge. CBR has proven to be a useful technique for providing adapted and personalized recommendations. eXiT*CBR has been used for this purpose in two scenarios: care of premature babies at home and insulin dose recommendation. Moreover, from the collaboration with a pediatric team, a project for promoting healthy habits in obese children was developed.

A. Premature Babies at Home

Nowadays, women give birth to healthy babies, but sometimes prematurely because of advanced maternal age. As a result, many newborns must stay several weeks in hospital intensive care units until they reach a safe weight. However, recent studies have shown that premature babies develop faster if they stay at home, in a familiar environment, instead of at the hospital. Thus, several hospitals have started to discharge babies to their homes under special hospital programs, in which a nurse visits the baby and the family at home once or twice per week.

In order to improve the monitoring of such babies, reduce the stress of parents, and dedicate nursing personnel mainly to critical cases, the NoaH platform was developed (see Figure 2). The medical team was headed by Dr. Abel Lopez-Bermejo (pediatrician at the Hospital Dr. Trueta of Girona) in cooperation with Dra. Eva Bargallo and Dra. Cristina Armero, both neonatologists, and supported by Dra. Judith Bassols. This work was carried out in the context of the MOSHCA European project (EUREKA ITEA 2 n 11027 IPT-2012-0943-300000).

The NoaH platform has two main components: a mobile component and a server component. The mobile part is managed by the parents and is connected to a pulse oximeter that reads the baby’s vital signs. Once a day, parents place the sensor on the baby and enter some other information into a dedicated mobile application, such as the baby’s weight, number of intakes, bowel movements, sleep, etc. Thanks to a hybrid system composed of a rule-based system (with general medical knowledge) and a CBR system (with particular knowledge of the baby’s history), the parents receive an assessment of the child: normal, warning, or alert. This information is sent through the NoaH platform to the clinical server, where the clinical staff reviews it.

The CBR methodology helps to avoid false positives by considering information about the baby’s history. For example, a rule could state that a baby should gain 10 g of weight per day, while a particular baby gains around 8 g per day. If the daily gain is consistent with her history, the system will not trigger any warning [12]. Furthermore, in order to take advantage of the computing capabilities of mobile phones, context-aware reasoning was also considered [24].
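The weight-gain example can be sketched as follows. This is only an illustration of combining a general rule with a history-based check, not the NoaH implementation: the tolerance value and the use of a simple mean over past gains are assumptions, and only two of the three assessment levels are modelled.

```python
from statistics import mean

EXPECTED_GAIN_G = 10.0  # general rule from the example above: about 10 g per day
TOLERANCE = 0.25        # hypothetical: accepted relative deviation from the baby's own history


def assess_weight_gain(today_gain, history):
    """Return 'normal' or 'warning' for today's weight gain, in grams per day."""
    if today_gain >= EXPECTED_GAIN_G:
        return "normal"  # the general rule is satisfied
    if history:
        usual = mean(history)
        # History-based check: the value is usual for this particular baby.
        if usual > 0 and abs(today_gain - usual) / usual <= TOLERANCE:
            return "normal"
    return "warning"


consistent = assess_weight_gain(8.0, [8.2, 7.9, 8.1])      # gain matches her history
unexpected = assess_weight_gain(8.0, [12.0, 11.5, 12.5])   # clear drop from her history
no_history = assess_weight_gain(8.0, [])                   # only the general rule applies
```

The first call suppresses the warning that the rule alone would raise, which is the false-positive reduction described in the text.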

The NoaH platform was tested in a controlled scenario (in hospital) with positive feedback from users, and it is in the process of being deployed in practice. Moreover, in 2015 it was awarded the second prize in the Vall d’Hebron Research Institute (VHIR) Innovation Healthcare Contest, and it was a finalist in the mHealth category of the Summer Competition of the Universite de la Sante, Castres, France.

B. Insulin Bolus Recommender System

The burden of diabetes has led to a huge research effort to improve the quality of life of people who suffer from the disease, as well as to minimize its impact on healthcare systems. In this context, the PEPPER project [9] proposes the development of an ICT platform to follow up people with Type 1 Diabetes Mellitus (T1DM) and recommend, through a mobile recommender app, the insulin dose to administer at meal events. The platform has two main components: a mobile part and a server component. As in NoaH, the mobile part is aimed at citizens, in this case people with T1DM, while the server component is designed to be managed by the clinical staff.

The mobile component is attached to two sensors: a continuous glucometer that measures the blood glucose level, and a wristband that provides information about the user’s physical activity. Moreover, the outcome (insulin dose) can be sent to an insulin pump for infusion. Thus, two scenarios are considered in PEPPER: multiple daily injections with an insulin pen (MDI architecture in Figure 3), and an insulin pump (CSII architecture in Figure 3). In the latter case, after the user validates the system recommendation, the dose is automatically sent to the pump and administered to the user.

Regarding the recommendations provided in the mobile component, the use of CBR enables a personal answer that adapts to the user over time. The PEPPER CBR methodology has been implemented using eXiT*CBR. However, the following new functions were added to meet PEPPER requirements: numeric reuse, a revise step, and a maintenance step, which includes retain, review and restore and is capable of tackling concept drift. The first task addressed was numeric reuse, as the solution of the CBR methodology is a numeric parameter used to calculate the needed insulin dose [29]. Second, the CBR methodology of eXiT*CBR was extended with a revise step and a maintenance phase, as proposed in [19], in order to correct proposed solutions given their outcome and then analyze the convenience of incorporating new experiences into the knowledge base and deleting old ones. Thus, these steps enable the system to learn from its continuous use.
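Numeric reuse means that the retrieved cases contribute a number rather than a class label. One common way to combine them, shown here purely as an assumption and not as the method of [29], is a similarity-weighted mean of the solutions stored in the retrieved cases. The values are abstract and carry no clinical meaning.

```python
def numeric_reuse(neighbours):
    """Similarity-weighted mean of the numeric solutions of the retrieved cases.

    neighbours: list of (similarity, solution) pairs with non-negative similarities.
    """
    total = sum(similarity for similarity, _ in neighbours)
    if total == 0:
        raise ValueError("at least one retrieved case must have positive similarity")
    return sum(similarity * solution for similarity, solution in neighbours) / total


# Hypothetical retrieved cases: (similarity to the query, stored numeric parameter).
equal_weights = numeric_reuse([(1.0, 10.0), (1.0, 20.0)])
closer_case_dominates = numeric_reuse([(0.75, 8.0), (0.25, 12.0)])
```

The more similar case pulls the result towards its own solution, so the second call returns a value closer to 8 than to 12.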

Maintenance methods, according to [19], add two stages to the CBR methodology: review and restore. In the review stage, the case base is analyzed in search of obsolete data, according to what is known as concept drift [6], [20], that is, changes in concepts. In the particular case of T1DM management, people can undergo physiological changes, so that solutions that were right two months ago may no longer be valid. Consequently, in the restore stage, cases that are no longer valid (or that some review metric has labeled for discarding) are removed from the case base [30].
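The two maintenance stages can be sketched as below. The review criteria used here (case age and a count of poor outcomes) and their thresholds are hypothetical stand-ins for the review metrics of [30]; the sketch only shows the split between flagging cases and removing them.

```python
from dataclasses import dataclass


@dataclass
class Case:
    features: tuple
    solution: float
    day: int           # day on which the case was stored
    failures: int = 0  # times the reused solution led to a poor outcome


def review(case_base, today, max_age_days=60, max_failures=2):
    """Review stage: flag cases that look obsolete (too old or repeatedly unsuccessful)."""
    return [
        case for case in case_base
        if today - case.day > max_age_days or case.failures > max_failures
    ]


def restore(case_base, flagged):
    """Restore stage: return the case base without the flagged cases."""
    flagged_ids = {id(case) for case in flagged}
    return [case for case in case_base if id(case) not in flagged_ids]


# Hypothetical case base observed on day 100.
case_base = [
    Case((1.0,), 1.0, day=0),               # older than 60 days
    Case((2.0,), 2.0, day=50),              # recent and reliable
    Case((3.0,), 3.0, day=90, failures=3),  # recent but repeatedly unsuccessful
]
flagged = review(case_base, today=100)
maintained = restore(case_base, flagged)
```

Keeping the two stages separate lets the review criteria change without touching the code that edits the case base.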

The medical team involved in the decisions made in the development of the new functions includes all the clinical staff of the PEPPER project (http://www.pepper.eu.com/).

The PEPPER insulin recommender system has been tested with the UVA/PADOVA simulator [21], achieving successful results as reported in [29] and [28]. It will soon be validated with real users in the context of the PEPPER project.


Fig. 2. Simplified architecture of NoaH from [24].

Fig. 3. Pepper architecture from [9].

C. Healthy kids

Taking care of our children today is the best bet we can make for a healthy society tomorrow. However, three main diseases threaten children’s health: obesity, diabetes and asthma. While parents are conscious that diabetes and asthma are serious diseases, they seem to be less aware of the harm of obesity. To address obesity, we developed MATCHuP, a tool to foster good nutrition and exercise habits with the aim of reducing both obesity and diabetes, known as metabolic diseases. The MATCHuP platform targets people with metabolic diseases and their families in order to improve their education in the right habits. MATCHuP recommends actions connected to the patients’ community, using collaboration strategies to simplify input validation and competition incentives to reward the user through gaming. Dr. Abel Lopez-Bermejo from the Hospital Dr. Trueta of Girona supervised the research [18].

V. BIOSIGNAL PROCESSING

The appearance of new wearable and non-invasive sensors is providing new and disruptive solutions for patient monitoring, disease management and treatment adherence [22]. In fact, the mHealth applications described in the previous section make use of wearables: NoaH uses a pulse oximeter, PEPPER uses a continuous glucometer and an activity wristband, and MATCHuP also uses wristbands for monitoring children’s activity. However, there are new opportunities: new sensors that could be investigated for Medicine and Healthcare. This is the case of force sensors, for rehabilitation purposes, and portable EEG, for epileptic seizure detection.

First, we worked in collaboration with the Evalan company in the context of the MOSHCA European project (EUREKA ITEA 2 n 11027 IPT-2012-0943-300000) and with the rehabilitation team of Dr. Herman R. Holstlag, Marco Raaben and Taco J. Blokhuis. The company developed sandals with force sensors in order to monitor patients’ gait, in particular that of people who have undergone hip surgery. The gait analysis distinguishes people who recovered in the expected time from people who had a slow recovery or never returned to a normal situation. Therefore, we dealt with a classification problem (good versus bad recovery). The input signal was first processed with a bag-of-steps methodology, and then support vector machines and CBR were used for classification [25].

An EEG wearable has been created by the MJN Neuroserveis company (http://mjn.cat/), and eXiT is studying the application of AI and ML techniques to it. This collaboration is subject to a confidentiality agreement.

VI. HEALTHCARE SERVICES

This section covers the research related to the management of services, such as ambulance coordination, medical equipment maintenance, and emergency attendance forecasting.

A. Ambulance coordination

Ambulance services can be grouped into two main categories: emergency and programmed services. Emergency services need to guarantee that people requiring urgent treatment are attended in less than 15 minutes, regardless of their distance to the nearest medical centre. To that end, we developed a multi-agent system that includes fuzzy filters [13]. The system has been tested with data from the Girona region, in collaboration with Quim Llorella and Rosa Pérez (SEM services), who shared with us the details of the real emergency transportation service. The work was supported by the Spanish MEC (Ministerio de Educación y Ciencia) project TIN2004-06354-C02-02.

Regarding the second category, programmed patient transport, the problem consists of providing, on a daily basis, a 24-hour schedule for non-urgent ambulances that move patients from home (or residences) to hospitals to receive a particular treatment, and then providing an online schedule to take them back home. To that end, we developed an ambulance coordination software module, which has been transferred to the Lafcarr company under the name IrisAmb.

It is worth highlighting that the research on ambulance coordination, especially on optimization methods, led in 2011 to the creation of Newronia S.L., a spin-off company (http://en.newronia.com/) devoted to providing AI-based solutions that optimize resources related to transportation and logistics.

B. Medical Equipment Maintenance

Hospitals face a growing burden in managing medical equipment. To support them, eXiT has carried out two works: workflow monitoring for equipment repair, and fault prediction for complex medical equipment.

First, workflow monitoring was proposed to follow up the equipment repair status: expensive equipment always had the highest priority, which meant that less expensive equipment was never repaired. To implement the solution, we used Petri nets for workflow modeling and complex event processing (CEP) for workflow monitoring, in order to predict possible delays [8]. The research was conducted within the AIMES European project (Eureka ITEA2 07017, TSI-020400-2008-047) in collaboration with the University Hospital Magdeburg.
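To make the two ingredients concrete, the sketch below models a repair workflow as a very small Petri net and checks an event stream for overdue repairs. It is not the system of [8]: the places, transitions, event names and deadline are hypothetical, and the monitoring function is a plain rule over events rather than a full CEP engine.

```python
class PetriNet:
    """Minimal Petri net with arc weight 1; each place appears at most once per transition."""

    def __init__(self, marking):
        self.marking = dict(marking)  # place -> number of tokens
        self.transitions = {}         # name -> (input places, output places)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (list(inputs), list(outputs))

    def enabled(self, name):
        inputs, _ = self.transitions[name]
        return all(self.marking.get(place, 0) >= 1 for place in inputs)

    def fire(self, name):
        if not self.enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        inputs, outputs = self.transitions[name]
        for place in inputs:
            self.marking[place] -= 1
        for place in outputs:
            self.marking[place] = self.marking.get(place, 0) + 1


def delayed_repairs(events, now, deadline):
    """Report items whose 'reported' event is older than the deadline with no 'repaired' event."""
    reported, repaired = {}, set()
    for time, item, kind in events:
        if kind == "reported":
            reported[item] = time
        elif kind == "repaired":
            repaired.add(item)
    return sorted(
        item for item, time in reported.items()
        if item not in repaired and now - time > deadline
    )


# Hypothetical repair workflow: reported -> diagnosed -> repaired.
net = PetriNet({"reported": 1})
net.add_transition("diagnose", ["reported"], ["diagnosed"])
net.add_transition("repair", ["diagnosed"], ["repaired"])
repair_enabled_at_start = net.enabled("repair")
net.fire("diagnose")
net.fire("repair")

# Hypothetical event stream: (time in hours, equipment, event kind).
events = [(0, "pump", "reported"), (2, "scanner", "reported"), (5, "scanner", "repaired")]
overdue = delayed_repairs(events, now=30, deadline=24)
```

The net enforces the order of workflow steps, while the event rule flags the low-priority item that has been waiting too long.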

In the second case, we dealt with a complex piece of equipment made up of several components, each of which provides its own log reports. The aim was to predict failures of the equipment as a whole by combining the information in the individual logs. Therefore, we were dealing with longitudinal data. We used sequence learning methods to learn sequential patterns and then applied CBR for failure forecasting [4]. The work was carried out under a confidential contract with a multinational company.

C. Emergency attendance forecasting

Hospital patient waiting times and length of stay are indicators of the quality of emergency department (ED) services. Accurate estimates of ED patient arrivals are needed to manage resources effectively. In a tourist region, this estimation is difficult because of population variability. For this reason, we collaborated with the ED of the Hospital of Palamos, located in a tourist region with high population variability. We have tested several regression methods for attendance forecasting at different time horizons, using some exogenous variables such as calendar and weather data (see footnote 1).
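As a simple illustration of regression with exogenous variables, the sketch below fits ordinary least squares on one calendar feature and one weather feature. The paper does not state which regression methods were tested, so the model, the features and the synthetic, noise-free data are all assumptions made for the example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


def features(is_weekend, temperature_c):
    """Intercept, one calendar variable and one weather variable."""
    return [1.0, 1.0 if is_weekend else 0.0, temperature_c]


def predict(weights, row):
    return sum(w * x for w, x in zip(weights, row))


# Synthetic history generated as: arrivals = 50 + 20 * weekend + 2 * temperature.
history = [(False, 10.0), (False, 20.0), (True, 15.0), (True, 25.0), (False, 30.0)]
rows = [features(weekend, temperature) for weekend, temperature in history]
targets = [50 + 20 * row[1] + 2 * row[2] for row in rows]
weights = fit(rows, targets)
forecast = predict(weights, features(True, 28.0))
```

Because the synthetic data contain no noise, the fitted weights recover the generating coefficients and the forecast for a warm weekend day is close to 126 arrivals.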

VII. CONCLUSIONS

The eXiT research group has a long track record in the Biomedical and Healthcare fields. This paper summarizes its main activities in these fields in recent years.

Future research includes new application domains. First, regarding CDSS, we have recently been awarded a project on treatment recommendations for attention-deficit hyperactivity disorder, headed by Dr. D. Serrano and Dr. X. Castells. Second, concerning mHealth, we have an agreement with a private company to work with patients suffering from migraines. Third, regarding biosignal processing, we are starting some research on mental diseases. Finally, with regard to healthcare services, a PhD work is being conducted on improving the optimization of plans for intensity-modulated radiotherapy.

ACKNOWLEDGMENT

Thanks to the eXiT*CBR team: Albert Pla, Pablo Gay, Jordi Coll, Francisco Gamero, Marc Compta, José Antonio Manrique, Daniel Macaya, Alejandro Pozo-Alonso. Thanks also to former members of eXiT and, in particular, of the Medicine and Health lab.

1 Publication is under review.


REFERENCES

[1] A. Aamodt and E. Plaza. Case-based reasoning: Foundational issues, methodological variations, and system approaches. AI Communications, 1994.

[2] Z. Alberto, F. Torrent-Fontbona, B. Lopez, E. Feliu, C. Carrion, M. Aymerich, X. Castells, L. Blanco, A. Martín-Urda, A. Pozo-Alonso, and C. Capella. Desarrollo, validacion en pacientes de alto riesgo vascular y evaluacion de un CDSS de ayuda en el tratamiento hipolipemiante (HTE-DLPR). Revista Espanola de Comunicacion en Salud, Suplemento:17–18, 2016.

[3] E. Capobianco. Ten challenges for systems medicine. Frontiers in Genetics, 3:193, 2012.

[4] M. Compta and B. Lopez. Integration of Sequence Learning and CBR for Complex Equipment Failure Prediction. In Ram A., Wiratunga N. (eds) Case-Based Reasoning Research and Development. ICCBR 2011. Lecture Notes in Computer Science, vol 6880, pages 408–422. Springer, Berlin, Heidelberg, 2011.

[5] A. El-Fakdi, F. Gamero, J. Melendez, V. Auffret, and P. Haigron. eXiTCDSS: A framework for a workflow-based CBR for interventional Clinical Decision Support Systems and its application to TAVI. Expert Systems with Applications, 41(2):284–294, feb 2014.

[6] J. Gama, I. Zliobaite, A. Bifet, M. Pechenizkiy, and A. Bouchachia. A survey on concept drift adaptation. ACM Computing Surveys, 46(4):1–37, mar 2014.

[7] P. Gay, B. Lopez, A. Pla, J. Saperas, and C. Pous. Enabling the use of hereditary information from pedigree tools in medical knowledge-based systems. Journal of Biomedical Informatics, 46(4):710–720, aug 2013.

[8] P. Gay, A. Pla, B. Lopez, J. Melendez, and R. Meunier. Service workflow monitoring through complex event processing. In Proceedings of the 15th IEEE International Conference on Emerging Technologies and Factory Automation, ETFA 2010, 2010.

[9] P. Herrero, B. Lopez, and C. Martin. PEPPER: Patient Empowerment Through Predictive Personalised Decision Support. In Proc. ECAI Workshop on Artificial Intelligence for Diabetes, pages 8–9, 2016.

[10] B. Lopez. Case-based reasoning: A concise introduction, volume 20. Morgan & Claypool Publishers, Synthesis Lectures on Artificial Intelligence and Machine Learning, 2013.

[11] B. Lopez, V. Barrera, J. Melendez, C. Pous, J. Brunet, and J. Sanz. Subgroup Discovery for Weight Learning in Breast Cancer Diagnosis. In Combi C., Shahar Y., Abu-Hanna A. (eds) Artificial Intelligence in Medicine. AIME 2009. Lecture Notes in Computer Science, vol 5651, pages 360–364. Springer, Berlin, Heidelberg, 2009.

[12] B. Lopez, J. Coll, F.-I. Gamero, E. Bargallo, and A. Lopez-Bermejo. Intelligent systems for supporting premature babies healthcare with mobile devices. In Mobilmed, 2013.

[13] B. Lopez, B. Innocenti, and D. Busquets. A multiagent system for coordinating ambulances for emergency medical services. IEEE Intelligent Systems, 23(5), 2008.

[14] B. Lopez, C. Pous, P. Gay, and A. Pla. Multi Criteria Decision Methods for Coordinating Case-Based Agents. In L. Braubach et al. (Eds.): MATES 2009, LNAI 5774, pages 54–65. Springer, Berlin, Heidelberg, 2009.

[15] B. Lopez, C. Pous, P. Gay, A. Pla, J. Sanz, and J. Brunet. eXiT*CBR: A framework for case-based medical diagnosis development and experimentation. Artificial Intelligence in Medicine, 51(2):81–91, feb 2011.

[16] B. Lopez, C. Pous, A. Pla, and P. Gay. Boosting CBR Agents with Genetic Algorithms. In McGinty, L., Wilson, D.C. (eds.) ICCBR 2009. LNCS (LNAI), pages 195–209. Springer, Berlin, Heidelberg, 2009.

[17] B. Lopez, C. Pous, J. Serena, and J. Piula. Cooperative case-based agents for acute stroke diagnosis. In ECAI Workshop on Agents Applied in Health Care, Riva di Garda, Italia, pages 21–28, 2006.

[18] B. Lopez, S. Soung, N. Mordvanyuk, A. Pla, P. Gay, and A. Lopez-Bermejo. MATCHuP: An mHealth Tool for Children and Young People Health Promotion. In Proceedings of the 10th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 5: HEALTHINF, Porto, Portugal, pages 313–318. SCITEPRESS, 2017.

[19] R. Lopez de Mantaras, D. McSherry, D. Bridge, D. Leake, B. Smyth, S. Craw, B. Faltings, M. L. Maher, M. T. Cox, K. Forbus, M. Keane, A. Aamodt, and I. Watson. Retrieval, reuse, revision and retention in case-based reasoning. The Knowledge Engineering Review, 20(03):215, sep 2005.

[20] N. Lu, G. Zhang, and J. Lu. Concept drift detection via competence models. Artificial Intelligence, 209:11–28, apr 2014.

[21] C. D. Man, F. Micheletto, D. Lv, M. Breton, B. Kovatchev, and C. Cobelli. The UVA/PADOVA Type 1 Diabetes Simulator: New Features. Journal of Diabetes Science and Technology, 8(1):26–34, jan 2014.

[22] A. Pantelopoulos and N. G. Bourbakis. A survey on wearable sensor-based systems for health monitoring and prognosis. Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on, 40(1):1–12, 2010.

[23] A. Pla, B. Lopez, P. Gay, and C. Pous. eXiT*CBR.v2: Distributed case-based reasoning tool for medical prognosis. Decision Support Systems, 54(3):1499–1510, feb 2013.

[24] A. Pla, B. Lopez, N. Mordvaniuk, C. Armero, and A. Lopez-Bermejo. Context Management in Health Care Apps. In AAMAS Workshop A2HC Agents Applied in Health Care, 8 pages, 2015.

[25] A. Pla, N. Mordvanyuk, B. Lopez, M. Raaben, T. Blokhuis, and H. Holstlag. Bag-of-steps: Predicting lower-limb fracture rehabilitation length by weight loading analysis. Neurocomputing, 268, 2017.

[26] C. Pous, D. Caballero, and B. Lopez. Diagnosing patients with a combination of principal component analysis and case based reasoning. International Journal of Hybrid Intelligent Systems, 6(2):111–122, may 2009.

[27] C. Pous, P. Gay, A. Pla, J. Brunet, J. Sanz, T. R. y. Cajal, and B. Lopez. Modeling Reuse on Case-Based Reasoning with Application to Breast Cancer Diagnosis. In Dochev D., Pistore M., Traverso P. (eds) Artificial Intelligence: Methodology, Systems, and Applications. AIMSA 2008. Lecture Notes in Computer Science, vol 5253, pages 322–332. Springer, Berlin, Heidelberg, 2008.

[28] F. Torrent-Fontbona. Adaptive basal insulin recommender system based on Kalman filter for type 1 diabetes. Expert Systems with Applications, 101:1–7, jul 2018.

[29] F. Torrent-Fontbona and B. Lopez. Personalised Adaptive CBR Bolus Recommender System for Type 1 Diabetes. IEEE Journal of Biomedical and Health Informatics, pages 1–1, 2018.

[30] F. Torrent-Fontbona, J. Massana, and B. Lopez. Case-base maintenance of a personalised bolus insulin recommender system for Type 1 Diabetes Mellitus. In Proceedings of Joint Workshop on Artificial Intelligence in Health (AIH2018), Stockholm, pages 22–32. http://ceur-ws.org/Vol-2142/, 2018.


Research on medical decision analysis at the CISIAD, UNED

Francisco Javier Díez, Manuel Luque, Jorge Pérez-Martín, Manuel Arias

Dept. Artificial Intelligence

Universidad Nacional de Educación a Distancia (UNED)

Madrid, Spain

{fjdiez, mluque, jperezmartin, marias}@dia.uned.es

Abstract—The Research Centre for Intelligent Decision-Support Systems (CISIAD) has been doing research on probabilistic graphical models applied to medicine for almost three decades. In this paper we summarise the contributions we have made, analyse the main difficulties we have found, and present the main failures and successes we have had in those years.

Index Terms—artificial intelligence, probabilistic graphical model, Bayesian network, influence diagram, Markov model, medical decision making, cost-effectiveness analysis

I. INTRODUCTION

One of the features of artificial intelligence systems is the ability to draw conclusions in uncertain domains. In medicine uncertainty is ubiquitous, mainly due to limited knowledge about the causal mechanisms and to the non-determinism of the real world. For this reason medicine has been one of the main fields of application since the beginning of artificial intelligence. The first methods for reasoning with uncertainty were based on the theory of probability, more specifically on what we now call the naïve Bayes method, which relies on two assumptions: diseases are mutually exclusive and findings are conditionally independent given the diagnosis. With these simplifying hypotheses it was possible to build several models that succeeded in solving diagnostic problems in the 1960s and 1970s [1]–[6]. However, the assumptions required by this method are usually unrealistic in practice, which led many researchers to assert that probabilistic methods could not be used to solve large AI problems; see [7], [8] for a discussion.
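To make the two naïve Bayes assumptions concrete, the following minimal sketch computes the posterior probability of each diagnosis from a set of binary findings. The diseases, findings and all probabilities are hypothetical values chosen only for illustration; they are not taken from any of the cited systems.

```python
from math import prod


def naive_bayes_posterior(priors, likelihoods, findings):
    """Posterior P(disease | findings) under the naive Bayes assumptions.

    priors:      {disease: P(disease)}; diseases are mutually exclusive
                 and exhaustive, so the priors sum to 1.
    likelihoods: {disease: {finding: P(finding present | disease)}}
    findings:    {finding: True/False}, the observed values.
    """
    joint = {}
    for disease, prior in priors.items():
        # Conditional independence: multiply one factor per finding.
        factors = [
            likelihoods[disease][f] if present else 1.0 - likelihoods[disease][f]
            for f, present in findings.items()
        ]
        joint[disease] = prior * prod(factors)
    total = sum(joint.values())
    return {disease: p / total for disease, p in joint.items()}


# Hypothetical numbers, for illustration only.
priors = {"flu": 0.10, "cold": 0.30, "healthy": 0.60}
likelihoods = {
    "flu": {"fever": 0.90, "cough": 0.80},
    "cold": {"fever": 0.20, "cough": 0.70},
    "healthy": {"fever": 0.01, "cough": 0.05},
}
posterior = naive_bayes_posterior(
    priors, likelihoods, {"fever": True, "cough": True}
)
```

Because the diseases are assumed mutually exclusive, the model cannot represent a patient who has both conditions at once, which is one reason the assumptions are unrealistic in practice.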

The situation changed significantly in the next decade with the advent of probabilistic graphical models (PGMs). Howard and Matheson, two economists of the Stanford Research Institute (SRI), developed influence diagrams as a compact representation of decision problems, alternative to decision trees [9], and Judea Pearl, an artificial intelligence researcher at UCLA, developed Bayesian networks as an extension of the naïve Bayes method [10], [11]. Very soon other authors proposed efficient algorithms for the evaluation of influence diagrams [12], [13] and Bayesian networks [14]. The first PGMs for real-world medical problems were developed in the following years [15]–[18], and the number of applications has grown so fast since then that it is now impossible to keep a registry of all the medical applications that use PGMs.

This work has been supported by grant TIN2016-77206-R of the Spanish Government, co-financed by the European Regional Development Fund. J.P. received a predoctoral grant from the Spanish Ministry of Education.

In this paper we review some of the applications devel-

oped at the Centre for Intelligent Decision-Support Systems

(CISIAD) of the National University for Distance Education

(UNED), in Madrid, Spain, summarise the contributions we

have made, analyse the main difficulties we have found, and

present the main failures and successes we have had in almost

three decades of research.

II. PROBABILISTIC MODELS FOR MEDICAL PROBLEMS

A. PGMs for medical diagnosis

In 1989 Javier Díez began a doctoral thesis in artificial intelligence for medicine under the supervision of Prof. José Mira at UNED. The topic was the construction of an expert system for echocardiography, in collaboration with some doctors of the Hospital de la Princesa, in Madrid. In those years most expert systems were built using rules, and fuzzy logic was increasingly popular. Prof. Mira had supervised several PhD theses that had applied these techniques to different medical problems [19]–[22], so they seemed to be the obvious choice for Díez’s thesis. In the first knowledge elicitation sessions, however, one of the doctors proposed building a causal network: mitral stenosis causes left atrium hypertension, which back-propagates to the lungs, and so on. When Díez tried to encode this causal model into a set of rules, it proved impossible, because a piece of knowledge such as “A causes B” can be used either to infer B from A or vice versa, whereas rule-based reasoning is unidirectional. Additionally, when A and B are causes of C and C is observed, the presence of A makes B less likely (this phenomenon is called explaining away [10]) and, conversely, discarding A increases the suspicion that B has caused C. Due to these limitations of rule-based reasoning and to his training as a physicist, Díez began exploring probabilistic reasoning for causal models, without being aware of the landmark contributions made by Pearl a few years earlier [10], [11], [23], [24]. He thus rediscovered Bayesian networks, the noisy OR gate and its generalization to multivalued variables [25], which he called the noisy MAX [26], [27], and developed a new algorithm for evidence propagation [28]. In 1992 he spent three months at UCLA at the invitation of Judea Pearl and was able to catch up with the forefront of research in this field, which was led by Pearl’s group.
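Explaining away can be reproduced numerically with a two-cause noisy OR gate. The sketch below is a brute-force enumeration over a tiny network A → C ← B, not the propagation algorithm of [28]; the priors, link probabilities and leak value are hypothetical.

```python
from itertools import product


def noisy_or(link_probs, active, leak=0.0):
    """P(effect present | causes) under a noisy OR gate with a leak term."""
    prob_absent = 1.0 - leak
    for cause, strength in link_probs.items():
        if active[cause]:
            prob_absent *= 1.0 - strength
    return 1.0 - prob_absent


def posterior_b(prior_a, prior_b, link_probs, leak, evidence_a=None):
    """P(B present | C present [, A = evidence_a]) by full enumeration."""
    numerator = denominator = 0.0
    for a, b in product([False, True], repeat=2):
        if evidence_a is not None and a != evidence_a:
            continue
        p = (prior_a if a else 1.0 - prior_a) * (prior_b if b else 1.0 - prior_b)
        p *= noisy_or(link_probs, {"A": a, "B": b}, leak)
        denominator += p
        if b:
            numerator += p
    return numerator / denominator


# Hypothetical parameters, for illustration only.
links = {"A": 0.8, "B": 0.8}
p_b_given_c = posterior_b(0.1, 0.1, links, leak=0.01)
p_b_given_c_and_a = posterior_b(0.1, 0.1, links, leak=0.01, evidence_a=True)
p_b_given_c_not_a = posterior_b(0.1, 0.1, links, leak=0.01, evidence_a=False)
```

With these numbers, observing C alone raises the probability of B to about 0.50; confirming A brings it back down to about 0.12, and discarding A raises it to about 0.90. A set of unidirectional rules cannot produce this pattern.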


After finishing his doctorate in 1994 [29], Díez supervised several PhD theses that built Bayesian networks for various problems: Carmen Lacave [30], [31] built Prostanet for urology, Severino Fernández Galán [32] built Nasonet for nasopharyngeal cancer and Nuria Alonso Santander [33], an ophthalmologist at Hospital de la Princesa (Madrid), built Catarnet for cataract surgery.

B. PGMs for decision analysis

Because of our contact with health professionals, we realised that in medicine the final goal is not to issue a diagnosis, but to make decisions. In many cases finding out the disease with the highest probability or obtaining a list of variables whose posterior probability exceeds a certain threshold is not enough, because very often a low probability of a serious disease is more relevant than a high probability of a relatively benign disease. In fact, newspapers from time to time tell the story of a patient who died after presenting at the emergency room of a hospital and being discharged because the doctors just gave the most likely diagnosis, without taking into account that the symptoms were compatible with an infrequent but fatal disease. Clearly, in medicine a false positive and a false negative have very different costs: the former usually leads to performing additional tests and sometimes starting an unnecessary treatment, which has an economic cost and may cause anxiety and discomfort to the patient, whereas a false negative may delay a treatment necessary to save the patient’s life.
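The difference between choosing the most likely diagnosis and choosing the best action can be shown with a small expected-loss calculation. The two actions and the loss values below are hypothetical, on an arbitrary scale in which a missed serious disease is far more costly than an unnecessary test.

```python
def expected_loss(p_disease, loss_if_diseased, loss_if_healthy):
    """Expected loss of one action given the probability of the disease."""
    return p_disease * loss_if_diseased + (1.0 - p_disease) * loss_if_healthy


def best_action(p_disease, losses):
    """Action with minimum expected loss.

    losses: {action: (loss if diseased, loss if healthy)}
    """
    return min(losses, key=lambda a: expected_loss(p_disease, *losses[a]))


# Hypothetical losses: a false negative (discharging a diseased patient)
# costs 100, further testing costs 1 if healthy and 5 if diseased.
losses = {"discharge": (100.0, 0.0), "test_further": (5.0, 1.0)}

action_at_5_percent = best_action(0.05, losses)
action_at_half_percent = best_action(0.005, losses)
```

At a disease probability of 5%, "healthy" is by far the most likely diagnosis, yet further testing has the lower expected loss (1.2 versus 5.0). Only when the probability falls below roughly 1% does discharge become the better action under these assumed losses.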

For this reason we were interested in building models that explicitly took into account the decisions and the cost of tests and treatments, including the cost of giving the wrong treatment or no treatment. This is how we started investigating influence diagrams (IDs), which differ from Bayesian networks in that they have not only chance nodes but also decision and utility nodes [9]. During his doctoral work, Manuel Luque built Mediastinet, an influence diagram for the mediastinal staging of non-small cell lung cancer, in collaboration with Dr. Carlos Disdier, of Hospital San Pedro de Alcántara (Cáceres) [34], [35], and Diego León built Arthronet, an influence diagram for total knee arthroplasty, for his master’s thesis, in collaboration with a doctor from Valladolid [36]. Every influence diagram is equivalent to a decision tree, but IDs have the advantage of being much more compact and, consequently, much easier to build and modify. In particular, both Mediastinet and Arthronet are equivalent to decision trees containing more than 10,000 leaves, and there is an algorithm that can transform each of them into the corresponding tree, provided that the computer has enough working memory.

In these models the main criterion was clinical effectiveness,

measured in quality adjusted life years (QALYs), but they

also represented the economic cost. In health economics the

standard way of combining cost and effectiveness into a single

criterion is to compute the net monetary benefit,

$$\mathrm{NMB} = \lambda \cdot e - c ,$$

or, alternatively, the net health benefit,

$$\mathrm{NHB} = \lambda^{-1} \cdot \mathrm{NMB} = e - \lambda^{-1} \cdot c ,$$

where e is the effectiveness (measured in QALYs or in another clinical unit), c is the cost (in monetary units, such as dollars or euros) and λ, known as willingness to pay or cost-effectiveness threshold, is a parameter that converts effectiveness into monetary units [37], [38]. This way it was possible to evaluate Mediastinet taking into account just the effectiveness (i.e., disregarding the cost, by setting λ⁻¹ = 0) or for specific values of λ. The problem is that λ is different for each decision maker. In Spain most health economists accept λ = 30,000 €/QALY as a consensus value for our public health system, but some experts claim that this value is too high. Therefore, it would be desirable to perform a true cost-effectiveness analysis (CEA) in order to find out the values of λ (the thresholds) that determine the most beneficial intervention for each decision maker. Unfortunately, the algorithms available ten years ago were only able to perform CEA for decision trees containing just one decision node, at the root, and both Mediastinet and Arthronet had several decisions.
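For the simplest case, a single decision between two interventions, the relationship between λ, the net benefits and the threshold can be sketched as follows. The function names and the cost and effectiveness figures are hypothetical. The sketch does not reproduce the multi-decision algorithms of [39], [40].

```python
def net_monetary_benefit(effectiveness, cost, wtp):
    """NMB = lambda * e - c, in monetary units."""
    return wtp * effectiveness - cost


def net_health_benefit(effectiveness, cost, wtp):
    """NHB = e - c / lambda, in effectiveness units (e.g. QALYs)."""
    return effectiveness - cost / wtp


def icer(reference, alternative):
    """Value of lambda at which two interventions have the same NMB.

    Each intervention is a (cost, effectiveness) pair; the result is the
    incremental cost-effectiveness ratio of the alternative.
    """
    delta_cost = alternative[0] - reference[0]
    delta_effectiveness = alternative[1] - reference[1]
    return delta_cost / delta_effectiveness


def best_intervention(interventions, wtp):
    """Name of the intervention with the highest NMB for a given lambda."""
    return max(
        interventions,
        key=lambda name: net_monetary_benefit(
            interventions[name][1], interventions[name][0], wtp
        ),
    )


# Hypothetical (cost in euros, effectiveness in QALYs) per patient.
interventions = {"no_test": (1000.0, 10.0), "test": (4000.0, 10.2)}
threshold = icer(interventions["no_test"], interventions["test"])
```

Here the threshold is about 15,000 €/QALY: a decision maker with λ = 30,000 €/QALY would prefer "test", while one with λ = 10,000 €/QALY would prefer "no_test". Reporting the thresholds rather than a single optimal policy is what lets each decision maker apply their own λ.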

For this reason our group first developed a CEA algorithm for trees with several decisions [39] and then one for IDs [40]. After we implemented them in OpenMarkov, an open-source software tool that we describe below, it was possible to evaluate these IDs in a few seconds.

However, many medical problems involve asymmetries of several types. There is order asymmetry when the decisions are not totally ordered; for example, when it is not clear which test to do first, if any, and what tests to do afterwards depending on the result of previous tests. There may be information asymmetry due to conditional observability; for example, the result of a test is known only when the doctor decides to perform it. And there is domain asymmetry when the value of one variable restricts the values of others; for example, when the decision is not to do a test, its result is neither positive nor negative. In IDs the second and third types of asymmetry can be represented, clumsily, by adding dummy states to some variables, but order asymmetry cannot be represented because IDs require a total ordering of the decisions. To overcome these limitations we proposed decision analysis networks (DANs) [41] and developed a CEA algorithm for them [42].

C. PGMs for temporal reasoning

In the same way as our collaboration with medical doctors

led us from diagnosis (with Bayesian networks) to unicriterion

decision analysis and then to cost-effectiveness analysis, it

also showed us the importance of temporal reasoning. Our

group had proposed two new types of temporal PGMs, namely

networks of probabilistic events in discrete time (NPEDTs)

[32] and dynamic limited memory influence diagrams (DLIM-

IDs) [43]. The former were developed originally to model

the spread of nasopharyngeal cancer [44] and the latter to

predict the progression of carcinoid tumours [45]. For different


reasons neither of these types of networks could solve the typical problems in which health economists are currently interested. Therefore we extended our work on IDs to develop Markov IDs [46]. With them we have been able to conduct CEAs for several medical problems, such as pleural effusion [47], colorectal cancer [47] and cochlear implantation [48].
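As background for the kind of temporal problem health economists typically pose, the following sketch evaluates a plain Markov cohort model, accumulating discounted cost and QALYs cycle by cycle. It is a generic textbook construction, not a Markov ID and not the algorithm of [46]; the states, transition probabilities, costs and utilities are hypothetical.

```python
def markov_cohort(transition, start, cost, utility, cycles, discount=0.03):
    """Discounted expected cost and QALYs per patient for a cohort model.

    transition: {state: {next_state: probability}}, rows summing to 1
    start:      {state: initial proportion of the cohort}
    cost:       {state: cost incurred per cycle spent in the state}
    utility:    {state: QALYs gained per cycle spent in the state}
    """
    states = list(start)
    distribution = dict(start)
    total_cost = total_qalys = 0.0
    for t in range(cycles):
        weight = 1.0 / (1.0 + discount) ** t
        total_cost += weight * sum(distribution[s] * cost[s] for s in states)
        total_qalys += weight * sum(distribution[s] * utility[s] for s in states)
        # Move the cohort one cycle forward.
        distribution = {
            nxt: sum(distribution[cur] * transition[cur][nxt] for cur in states)
            for nxt in states
        }
    return total_cost, total_qalys


# Hypothetical two-state model, for illustration only.
transition = {
    "alive": {"alive": 0.9, "dead": 0.1},
    "dead": {"alive": 0.0, "dead": 1.0},
}
start = {"alive": 1.0, "dead": 0.0}
cost = {"alive": 100.0, "dead": 0.0}
utility = {"alive": 1.0, "dead": 0.0}

total_cost, total_qalys = markov_cohort(
    transition, start, cost, utility, cycles=2, discount=0.0
)
```

Running the model once per intervention yields the cost and effectiveness pairs that a cost-effectiveness analysis compares. A Markov ID additionally includes decision nodes, which this sketch leaves out.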

III. DEVELOPMENT OF OPEN-SOURCE TOOLS

These medical applications have been built with two open-source tools: Elvira and OpenMarkov. Elvira was the result of a collaboration of several Spanish universities, mainly Granada, Almería, Castilla-La Mancha, País Vasco and UNED [49]. The large number of developers, including experienced researchers, was the main reason for its rapid development and for the implementation of many algorithms for inference and learning. On the other hand, the physical distance between the teams involved in the project and the lack of adherence to the principles of software engineering made the tool difficult to maintain.

For this reason UNED started the development of a new tool, called OpenMarkov [50]. We built on the experience gained with Elvira, but all the code was written from scratch. It is now a large tool, with around 115,000 lines of Java code, excluding blanks, and more than 200,000 lines in total, organised in 44 Maven projects. The fact that it is managed by a single team and the use of several software engineering tools (Git, Maven, Nexus, Jenkins, etc.) have allowed us to make several redesign decisions and to maintain the code, which is still growing actively.

Both Elvira and OpenMarkov have advanced graphical user

interfaces for editing and evaluating PGMs. Elvira has many

learning algorithms, while OpenMarkov only implements the

two basic algorithms, namely search-and-score and PC, but in

general it is much more robust and more efficient in inference,

and offers more types of networks (Markov IDs, DANs, etc.),

more options for sensitivity analysis and temporal models,

CEA, and the possibility of learning Bayesian networks in-

teractively [51].

To our knowledge, Elvira was used in 10 countries, almost exclusively at universities, while OpenMarkov has been used for teaching, research and developing applications at universities, research centres and companies in more than 30 countries in Europe, America, Asia and Africa.

IV. DIFFICULTIES, FAILURES AND SUCCESSES

In this section we analyse some of the difficulties we have

found when building artificial intelligence applications for

medicine, which range from technical challenges to human

factors, and describe the main failures and successes we have

faced.

A. Building PGMs with expert knowledge

Our group differs from most others in the field of PGMs in that, instead of investigating new learning algorithms, we have specialised in building PGMs with expert knowledge. This process is time-consuming and, what makes it much more challenging, generally requires the collaboration of medical doctors. None of the health professionals who have collaborated with us has ever received any economic compensation for their work. Some of them have collaborated actively, but others showed so little commitment that it was difficult for us even to arrange meetings with them. For this reason, some of our attempts to build models for medical problems have failed after we had invested a significant amount of time and effort.

B. Use of PGMs for clinical decisions

Clinical decision-support systems can be used in at least two ways. One of them is to guide the diagnosis and the treatment of individual patients at the clinical consultation or at the bedside. Many expert systems have been designed for this purpose, including our first PGMs. However, we do not know of any AI system routinely used this way. We came close to succeeding with Catarnet, the above-mentioned Bayesian network for cataract surgery. The Health Department of the regional government of Madrid, which had financed the project, was interested in deploying this system in the new large hospitals it manages. We collaborated with the technicians of one of them to design a protocol for integrating Catarnet into their information system. When we were just about to start the tests, there was a change in the leadership of the hospital and the new person responsible refused to deploy the decision-support system unless he could obtain from it some benefit for his professional or political career.

AI might also be applied to developing public health policies. However, these policies are based, in the best scenarios, on epidemiological studies, economic evaluations of health technologies and the consensus of experts; there seems to be no room for expert systems. Nevertheless, in our group we have combined PGMs, an AI technique, with cost-effectiveness analysis, as mentioned above. One of the models we have built is a Markov ID for analysing the cost-effectiveness of paediatric bilateral cochlear implantation (BCI), i.e., for determining whether it is worth giving two implants instead of one to babies who are born with severe to profound deafness. The preliminary study we conducted, which included a thorough review of the literature, contributed to convincing the Ministry of Health that it is cost-effective, and Spain became the first country in the world, to the best of our knowledge, to include BCI for both children and adults in the portfolio of health services (cf. Orden SSI/1356/2015, de 2 de julio). In spite of this law, several regional governments still refuse to cover it in practice, even for newborns. We wrote a detailed report, based on our cost-effectiveness analysis [48], which proved beyond any reasonable doubt that this intervention is clearly cost-effective for children, and submitted it to the Ministry of Health and to 11 regional health departments. In May 2018 the Ministry of Health sent a letter to F. J. Díez in which it explicitly rectified its previous stance and confirmed that BCI must be covered by all health providers in Spain.

More recently, Catalonia and Andalusia, two of the regions that had steadfastly refused to cover it, have announced that they will start giving two implants to the children who need them. This is the first time that our research on medical AI has had an impact on the life of patients.

We are working on two models for finding the optimal

screening patterns for breast cancer and colorectal cancer.

Even though there are several studies about these topics, we

intend to develop new models and new algorithms for finding

the optimal screening pattern for each patient based on his/her

personal features. We will soon begin a CEA of screening for

cytomegalovirus in newborns; if our study concludes that it is

cost-effective, as some experts have recently claimed, health

authorities should include it in the battery of tests for neonatal

screening, which would have an impact on the life of many

children and families.

C. A probabilistic expert system for programming cochlear

implants

Our interest in cochlear implants led us to contact Dr. Paul Govaerts, who had been investigating the application of AI to programming them. This is a difficult task, because an implant has more than 100 electronic parameters that can be fitted. Improving the quality of hearing in one setting (for example, in a quiet room) may deteriorate the hearing in others (for example, in a noisy street). He had built a rule-based system that improved the performance of human audiologists, but the results were far from impressive. For this reason he started a new project, financed by an EU grant, aimed at building a new version of the expert system. A prestigious research group specialised in machine learning joined the project as a partner. However, some technical problems made it impossible to obtain the data they counted on, and even with those data it would have been virtually impossible to build a model using learning algorithms, due to the complexity of the task. Seeing that the project had run aground, Dr. Govaerts contacted our group. The combination of his knowledge of audiology and cochlear implant technology with our expertise in building PGMs from human knowledge made it possible to create a probabilistic model based on a causal graph and subjective estimates of the probabilities [47], [52], [53]. In a few months it gave the first useful results, and two years later it impressed some experts in Europe and the USA with its performance. The main manufacturer of cochlear implants, which has a market share of more than 50%, has bought the rights to exploit it exclusively.

V. CONCLUSIONS AND FUTURE WORK

Our group has been doing research on artificial intelligence

applied to health decision making for almost three decades.

We have contributed new algorithms [28], [39], [40], [42],

[50], [54]–[61], new types of probabilistic graphical models

(NPEDTs [32], DLIMIDs [43], [45], tuning networks [52],

Markov IDs [46], DANs [41]), new canonical models [26],

[27], [52] and several methods for the explanation of reasoning

[62]–[66]. Each of them was motivated by a specific medical

problem for which we were building a probabilistic network,

but all of them can be applied to other domains. Similarly,

the software tools we have developed [49], [50] are designed

mainly for medicine, but other groups have used them to build

applications in very different fields. These software tools have

also been very useful for teaching PGMs to our students [67],

[68].

In retrospect, we can see that our efforts to build decision-support systems for clinical consultation have fallen far short of the benefits we expected. Building a probabilistic model manually takes a lot of time and requires the commitment of medical doctors, who in some cases collaborate enthusiastically but in others are poorly motivated and abandon the project long before reaching the goal. Similarly, we have invested lots of time in developing software tools with advanced graphical user interfaces, in spite of our scarcity of funding and human resources. These tools have been very useful for our research and teaching, and also for many other universities on four continents. Several institutes and large companies in different countries have used OpenMarkov to build real-world applications. This has brought us the personal satisfaction of having offered the AI community a useful tool, but so far we have not obtained any economic return from it, and in the academic world, governed by the “publish or perish” principle, it is a risk to devote much time to tasks that yield poor results in terms of journal papers. Sometimes we ask ourselves if we made a mistake by following these lines of research instead of working on other areas, such as machine learning, in which the productivity is much higher.

Nonetheless, our research has also brought us other rewards. We have been pioneers in the application of AI to cost-effectiveness analysis, which is increasingly relevant for medical decision making. Our economic study of cochlear implantation has contributed to convincing the Spanish health authorities that profoundly deaf people should receive two cochlear implants instead of one, especially in the case of children. Our experience in building probabilistic models from human knowledge and our software tool, OpenMarkov, have been essential in the construction of an expert system that is routinely used for programming cochlear implants; given that there are hundreds of thousands of cochlear implant users in the world, we are happy to know that our work will contribute to improving the quality of life of so many people. OpenMarkov is superior in several aspects to the commercial products developed for the same purpose, and clearly inferior in others. Even though it is open-source, there are several possibilities of obtaining monetary returns from it: distributing it under dual licensing, offering consultancy (mainly to pharmaceutical companies and manufacturers of medical devices), doing contract development, etc. We are currently exploring these possibilities in order to obtain financial resources for our research activity.

Acknowledgements

We thank our colleagues and friends who collaborated in

the Elvira project, from whom we learnt so much, and all

the students and former members of our group who have col-

laborated in the development of OpenMarkov. We also thank


the reviewers of the First Workshop of Spanish AI Research

Groups in Biomedicine, at the CAEPIA-2018 conference, for

their corrections and suggestions.



Research Group on Artificial Intelligence at Universitat Rovira i Virgili (Tarragona)

David Riaño

Universitat Rovira i Virgili

Tarragona, Spain


Abstract—This paper describes a selection of some of the most representative research lines of the Research Group on Artificial Intelligence (BANZAI) at the Universitat Rovira i Virgili (Tarragona) in the area of Artificial Intelligence in Medicine and the main results achieved. These research interests are: the induction of clinical algorithms, the representation of knowledge to train clinicians, and the study of multimorbidity. Moreover, the paper describes the current and future research interests of the group.

Keywords—Research Group on Artificial Intelligence (BANZAI); artificial intelligence in biomedicine; description of research lines and projects.

I. INTRODUCTION

The Research Group on Artificial Intelligence BANZAI¹, at the Universitat Rovira i Virgili (Tarragona), was founded in 1998. In the last two decades, the work of the group has centered on Intelligent Data Analysis, Knowledge Representation in Health Care, and Clinical Decision Support Systems, in collaboration with several hospitals in Barcelona, Tarragona, and Reus.

¹ http://banzai-deim.urv.cat

In 2000 the group collaborated with the Hospital Universitari Joan XXIII (Tarragona) in the project COSYS: Sistema de Información Hospitalaria de Ayuda a la Adaptación de Pesos Relativos a los GRD. In this project, a tool was developed to classify patients in order to simplify the comparison between the real costs of inpatients and the relative weights of the Diagnosis-Related Groups (DRG) [1][2]. In 2003, they worked with the Palliative Care Unit at the Hospital de la Santa Creu i Sant Pau (Barcelona) in the project Palliasys: Use of new ICT to facilitate the treatment of palliative patients, both to implement an eHealth system to assist patients requiring palliative care at home and to analyze retrospective data about patients attended in the unit in order to discover protocols of efficient medical practice [3]-[10]. In 2006, they coordinated the project Hygia with the Hospital Clínic de Barcelona, the University of Santiago de Compostela and Universitat Jaume I. The contribution of BANZAI to the project was the analysis of retrospective data to construct treatment models for chronic patients [11]-[13]. Later on, in 2006, they coordinated the project K4CARE: Knowledge-Based Home-Care eServices for an Ageing Europe, which involved several hospitals in Europe as well as some technological partners [14]. The project combined healthcare and ICT experiences of several western and eastern EU countries to create, implement, and validate a knowledge-based health-care model for the professional assistance to senior patients at home [15]-[17].

In the last eleven years, this research group has organized an international workshop every year in the field of health-care knowledge representation. Since 2009, this workshop has been called KR4HC: Knowledge Representation for Health-Care and, in 2018, it reached its tenth edition. The group also edits the Springer LNAI book series on "Knowledge Representation for Health Care" (eight books published as of 2018), which contains extended versions of a selection of the best papers from those workshops.

The rest of the paper is organized in two sections: the first describes three research lines that show the current interests of the BANZAI group; the second provides a short discussion of future interests, together with conclusions.

II. RESEARCH LINES AND RESULTS

A. Induction of Clinical Algorithms

Chronic patients require continuous follow-up for long periods of time: years, or even a lifetime. This process is described by the clinical concept of Episode of Care (EOC), which can be defined as the services provided to a patient for a specific problem during a certain period of time. An EOC is based on the idea of the clinical encounter, in which the health-care professional (physician or nurse) observes the clinical condition of the patient (state), decides which issues require attention, and suggests clinical actions to address these issues. An EOC is then a sequence of encounters, each simplified as a vector (S, D, A), where S stands for the state of the patient, D for the issues decided to require attention, and A for the clinical actions performed or started in that particular encounter to address D. Primary care services register all this information in their databases or in the patients' EHRs.
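As a minimal illustration of this representation, an EOC can be modelled as an ordered list of (S, D, A) encounters. The sketch below is in Python; the class and field names (`Encounter`, `EpisodeOfCare`, `state`, `decisions`, `actions`) and the sample values are hypothetical and do not come from the group's software.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass(frozen=True)
class Encounter:
    """One clinical encounter, simplified as the triple (S, D, A)."""
    state: FrozenSet[str]      # S: observed condition of the patient
    decisions: FrozenSet[str]  # D: issues judged to require attention
    actions: FrozenSet[str]    # A: clinical actions performed or started


@dataclass
class EpisodeOfCare:
    """An EOC: the time-ordered encounters for one patient and one problem."""
    patient_id: str
    problem: str
    encounters: List[Encounter] = field(default_factory=list)

    def add(self, state, decisions, actions) -> None:
        """Append a new encounter at the end of the episode."""
        self.encounters.append(
            Encounter(frozenset(state), frozenset(decisions), frozenset(actions))
        )

    def actions_taken(self) -> List[str]:
        """All distinct actions in the episode, in order of first use."""
        seen, ordered = set(), []
        for enc in self.encounters:
            for action in sorted(enc.actions):
                if action not in seen:
                    seen.add(action)
                    ordered.append(action)
        return ordered


# Hypothetical example data, for illustration only.
eoc = EpisodeOfCare(patient_id="p001", problem="hypertension")
eoc.add({"high_bp"}, {"bp_control"}, {"start_drug_x"})
eoc.add({"high_bp", "on_drug_x"}, {"bp_control"}, {"increase_dose", "diet_advice"})
```

Keeping each encounter immutable and the episode ordered mirrors the definition of an EOC as a sequence; any real EHR schema would of course carry far more detail (dates, coded terminologies, care providers).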

In modern medicine, clinical quality is related to the evidence recorded in clinical practice guidelines, documents that gather all the clinical knowledge available for specific diseases. The quality of medical practice is therefore usually related to the level of adherence of clinicians and patients to the evidence in these guidelines. Unfortunately, adherence is not easy to check. To promote it, however, clinical processes can be simplified and represented as intuitive clinical algorithms, such as the one in Fig. 1, and finally incorporated into the guidelines.

FIGURE I. CLINICAL ALGORITHM

In this context, BANZAI and SAGESSA² worked together to automate the extraction of clinical algorithms from EOC databases. Clustering technologies were developed to isolate relevant states and treatment blocks (circles and squares in Fig. 1) from EOC data, and combined algorithms to induce decision trees were proposed to intertwine states and actions with decisions (diamonds in Fig. 1) [18]. These algorithms were used to compare clinical procedures across institutions and against the clinical algorithms recommended in different clinical practice guidelines [19].
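The induction step can be illustrated with the core operation of most decision-tree learners: choosing the state attribute whose values best separate the actions observed in the data. The sketch below uses information gain on toy records. It is not the combined algorithm of [18], and the attribute names and records are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(records, attribute, target="action"):
    """Reduction in entropy of `target` obtained by splitting on `attribute`."""
    base = entropy([r[target] for r in records])
    groups = defaultdict(list)
    for r in records:
        groups[r[attribute]].append(r[target])
    remainder = sum(len(g) / len(records) * entropy(g) for g in groups.values())
    return base - remainder


def best_split(records, attributes, target="action"):
    """Attribute with the highest information gain (the next decision node)."""
    return max(attributes, key=lambda a: information_gain(records, a, target))


# Toy encounters: state attributes plus the action that was taken.
records = [
    {"bp_high": True,  "diabetic": False, "action": "treat"},
    {"bp_high": True,  "diabetic": True,  "action": "treat"},
    {"bp_high": False, "diabetic": True,  "action": "monitor"},
    {"bp_high": False, "diabetic": False, "action": "monitor"},
]

print(best_split(records, ["bp_high", "diabetic"]))  # prints: bp_high
```

In this toy data the action depends only on `bp_high`, so that attribute has a gain of one bit and `diabetic` has none; a full learner would apply the same choice recursively to each branch.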

B. Knowledge Representation for Clinical Training

Clinical practice is a complex task in which several decision problems combine in sophisticated models, such as the MPM proposed by BANZAI [20]. Three of the main decision problems in these models are diagnosis, treatment, and prognosis, though each can comprise multiple subtypes.

² SAGESSA is an entity that manages all the clinical services in the south of Tarragona, including four hospitals, five primary care centers, and four rehabilitation centers.

Representing medical knowledge with AI structures to support these three decision problems starts with the selection of appropriate representation models. One such model is the decision table [21][22]. Interestingly, once constructed and validated, these knowledge structures are useful not only to support decision making, but also to train novice clinicians to make the right decisions.
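A decision table can be sketched as a list of rules, each pairing a set of condition values with an outcome; a lookup returns the outcome of the first rule whose conditions all hold. The same table can then score a trainee's answers. The conditions and outcomes below are placeholders, not clinical content from [21][22], and the function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Each rule: (required condition values, outcome). A condition that is
# absent from a rule is treated as "don't care".
Rule = Tuple[Dict[str, bool], str]

# Placeholder table; the condition and outcome names are hypothetical.
TABLE: List[Rule] = [
    ({"sign_a": True, "sign_b": True}, "outcome_1"),
    ({"sign_a": True, "sign_b": False}, "outcome_2"),
    ({"sign_a": False}, "outcome_3"),
]


def decide(table: List[Rule], case: Dict[str, bool]) -> Optional[str]:
    """Return the outcome of the first rule matched by `case`, else None."""
    for conditions, outcome in table:
        if all(case.get(name) == value for name, value in conditions.items()):
            return outcome
    return None


def score_trainee(table: List[Rule], cases: List[Dict[str, bool]],
                  answers: List[str]) -> float:
    """Fraction of trainee answers that agree with the table (training use)."""
    hits = sum(decide(table, c) == a for c, a in zip(cases, answers))
    return hits / len(cases)
```

For example, `decide(TABLE, {"sign_a": True, "sign_b": False})` returns `"outcome_2"`, and a trainee who answers one of two cases as the table does gets a score of 0.5.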

In collaboration with the Emergency Department of the Clinical Hospital in Barcelona (HCB), the BANZAI group worked to formalize the knowledge available in multiple clinical practice guidelines as grouping decision tables. These tables were wrapped in a web-based system to train residents of the HCB and to analyze the benefits of this sort of computer-based learning in medicine. The training system, whose functional architecture is shown in Fig. 2, was used to improve residents' skills in (1) applying differential diagnosis to secondary causes of hypertension [23], (2) providing the right treatment of circulatory shock in ERs and ICUs [24], and (3) simulating emergency shock to prognosticate patient evolution [25].

FIGURE II. SHOCK-INSTRUCTOR TRAINING TOOL ARCHITECTURE

These studies concluded that the use of knowledge-based e-learning computer tools can significantly improve the learning curve, the adherence, and the clinical results of novice residents beyond traditional training programs in schools of medicine and hospitals.

C. Dealing with Multimorbidity

Multimorbidity is the simultaneous observation of more than one disease in the same patient. This is a common clinical condition of elderly patients, who are more inclined to suffer several chronic ailments. The management of multimorbid patients is challenging for many reasons, such as issues related to polypharmacy or the lack of evidence about multimorbid cases. At the same time, the number of multimorbid cases increases year after year, and consequently the amount of clinical data generated about the treatment and evolution of these cases is large and growing rapidly. We performed a review of CS technologies applied to the management of multimorbid patients [26], whose results are summarized in Fig. 3. Three alternative technologies were identified: knowledge integration, treatment integration, and data integration.


FIGURE III. COMPARISON OF CS TECHNOLOGIES FOR MULTIMORBIDITY

Knowledge integration combines single-disease knowledge structures (and additional clinical knowledge) to generate multimorbid knowledge structures. Several works follow this approach; among them, the BANZAI group proposed a technology that adapts general Formal Intervention Plans (similar to clinical algorithms, as in Fig. 1) into Individual Intervention Plans according to the constraints of an ontology containing complementary clinical knowledge and the EHR of the patient. Individual Intervention Plans of different diseases for the same patient can be combined into a Unified Intervention Plan to define a holistic, personalized treatment for that multimorbid patient.

Treatment integration combines clinical lists of actions (including prescriptions and procedures) by resolving cross-interactions between actions for different diseases. In [27], a rule-based system to combine treatments of hypertension and/or diabetes mellitus and/or heart failure was proposed and tested on twenty multimorbid patients. In a later work [28], we also proposed a semi-automatic methodology to combine treatments that are expressed under a "global" structure of patient management in primary, secondary, and tertiary care.
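One way to picture treatment integration is as the union of per-disease action lists, checked against a table of known pairwise interactions. The sketch below is a simplification, not the rule-based system of [27]; the disease and action names and the interaction table are invented placeholders with no clinical meaning.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical per-disease action lists (placeholders, not medical advice).
TREATMENTS: Dict[str, List[str]] = {
    "disease_1": ["drug_a", "diet"],
    "disease_2": ["drug_b", "diet"],
    "disease_3": ["drug_c"],
}

# Hypothetical pairwise interactions and the rule that resolves each one.
INTERACTIONS: Dict[FrozenSet[str], str] = {
    frozenset({"drug_a", "drug_b"}): "replace drug_b with drug_d",
}


def integrate(diseases: List[str]) -> Tuple[List[str], List[str]]:
    """Merge action lists for `diseases`; return (actions, flagged conflicts)."""
    merged: List[str] = []
    for disease in diseases:
        for action in TREATMENTS[disease]:
            if action not in merged:  # drop duplicated actions such as "diet"
                merged.append(action)
    conflicts = [
        f"{a} + {b}: {INTERACTIONS[frozenset({a, b})]}"
        for a, b in combinations(merged, 2)
        if frozenset({a, b}) in INTERACTIONS
    ]
    return merged, conflicts


actions, conflicts = integrate(["disease_1", "disease_2"])
```

Here `actions` is `["drug_a", "diet", "drug_b"]` and `conflicts` holds the single flagged pair with its resolution rule. The sketch only reports the rule; applying it, and checking that the replacement introduces no new interaction, is where a real system needs clinical knowledge.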

Data integration is based on the idea of extracting knowledge about the treatment of multimorbid patients from the intelligent analysis (machine learning) of databases about the management of such patients. Our approach, unlike other process mining approaches, is a machine learning process in which the EOCs of multimorbid patients are transformed into SDA clinical algorithms (Fig. 1) [12][29].

III. FUTURE LINES AND CONCLUSIONS

Today, some of the research lines of the Research Group on Artificial Intelligence (BANZAI) described above have converged on interesting and challenging issues that will determine the course of our future research. Here, we broadly describe four of them.

The analysis of ICU data: Intensive Care Units are singular services in the sense that they produce a huge amount of data per patient, the related costs are high, the patients are very sensitive, and they may require rapid interventions by coordinated (sometimes multidisciplinary) expert teams. We are collaborating closely with the ICU of the Hospital Universitari Joan XXIII (Tarragona) to build a large database for scientific exploitation, similar to the US database MIMIC-III. In this collaboration, our main contribution is the analysis of these data with data mining technologies and the construction of descriptive and predictive models to answer important medical questions such as the length of stay, the parameters affecting the length of stay, the early detection of patients' complications and changes of state, the accurate use of antibiotics and pharmacological consumption in general, or the adherence to ICU guidelines.

The analysis of cancer data: We are concluding a preliminary longitudinal study on breast cancer data coming from the SEER database. We are also participating in the P-BreasTreat project, one of whose objectives is to combine visual data extracted from the analysis of images with clinical data taken from the patient's EHR in order to classify the type of cancer. This work is carried out in conjunction with the Oncology Service of the Hospital Sant Joan (Reus).

Multimorbidity: Although we achieved considerable results in this topic in the past, there are still some open questions that we would like to address in the near future. To mention just two, we are interested in determining whether there are significant differences in the interactions detected by different publicly available drug interaction checkers, and in the automatic construction of diagrams that highlight the differences between the treatment of a disease in monomorbid patients and the treatment of the same disease in multimorbid patients. Gaining insight into multimorbidity management will influence the next research line for the future.
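A simple starting point for the first question is to treat the output of each checker as a set of flagged drug pairs and to measure the overlap between sets. The sketch below uses the Jaccard index for this; the checker names and drug pairs are invented, and the choice of Jaccard is an assumption rather than a method stated by the group.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set, Tuple

Pair = FrozenSet[str]  # an unordered pair of drug names


def jaccard(a: Set[Pair], b: Set[Pair]) -> float:
    """Overlap of two sets of flagged pairs; 1.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_agreement(results: Dict[str, Set[Pair]]) -> Dict[Tuple[str, str], float]:
    """Jaccard index for every pair of checkers, keyed by sorted checker names."""
    return {
        (x, y): jaccard(results[x], results[y])
        for x, y in combinations(sorted(results), 2)
    }


# Hypothetical outputs of two checkers for the same prescription list.
results = {
    "checker_1": {frozenset({"drug_a", "drug_b"}), frozenset({"drug_a", "drug_c"})},
    "checker_2": {frozenset({"drug_a", "drug_b"})},
}
agreement = pairwise_agreement(results)
```

With these toy sets the two checkers share one of two flagged pairs, so their agreement is 0.5. Deciding whether such a difference is significant would need a statistical test over many prescription lists, which is beyond this sketch.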

Data synthetization: Data about clinical processes and patients are highly sensitive and subject to strict legal restrictions. Anonymization can exempt the data analyst from legal issues, but it does not prevent re-identification by cross-referencing databases. An alternative approach is to construct mechanisms for data synthetization. In a preliminary study [30] we identified three alternative methods for clinical data synthetization: statistical, knowledge representation, and simulation. The statistical method uses the available data about real patients to construct a statistical model of the clinical parameters, and then a generator uses this model to synthesize treatments of fictitious patients. The knowledge representation method integrates the available knowledge (in clinical guidelines, health-care ontologies, electronic books, web sites, etc.) into a knowledge base and then synthesizes fictitious treatments that are consistent with the knowledge base. Finally, the simulation method depends on the availability of a patient (or signs) simulator that emulates the evolution of that patient or those signs when a treatment is carried out. By combining the simulator with a treatment recommender system and running the two in a loop, a database containing specific fictitious treatments can be synthesized.
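The statistical method can be sketched in a few lines: estimate a simple model from real records, then sample fictitious records from it. The example below fits an independent normal distribution to each parameter, which is a strong simplifying assumption (it ignores correlations between parameters) and is not the method of [30]; the parameter names and values are made up.

```python
import random
import statistics
from typing import Dict, List, Tuple

Record = Dict[str, float]
Model = Dict[str, Tuple[float, float]]  # parameter -> (mean, std deviation)


def fit(records: List[Record]) -> Model:
    """Estimate mean and sample standard deviation of every parameter."""
    names = list(records[0].keys())
    return {
        n: (statistics.mean([r[n] for r in records]),
            statistics.stdev([r[n] for r in records]))
        for n in names
    }


def synthesize(model: Model, n: int, seed: int = 0) -> List[Record]:
    """Draw `n` fictitious records; a fixed seed makes the output repeatable."""
    rng = random.Random(seed)
    return [
        {name: rng.gauss(mu, sigma) for name, (mu, sigma) in model.items()}
        for _ in range(n)
    ]


# Made-up "real" records, for illustration only.
real = [
    {"heart_rate": 72.0, "systolic_bp": 118.0},
    {"heart_rate": 80.0, "systolic_bp": 130.0},
    {"heart_rate": 76.0, "systolic_bp": 124.0},
]
model = fit(real)
fake = synthesize(model, n=5)
```

Because each synthetic value is drawn from a fitted distribution rather than copied, no fictitious record corresponds to a real patient; whether the model itself leaks information about small cohorts is a separate question that this sketch does not address.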


In conclusion, the Research Group on Artificial Intelligence at the Universitat Rovira i Virgili (Tarragona) has been working in the area of medical informatics since 1998 with two main technological focuses: intelligent data analysis and knowledge representation. We have a long track record of collaboration with multiple hospitals and health-care centers. Our current and future interests are the analysis of ICU and cancer data, and the synthetization of EHR data for multimorbid conditions.

ACKNOWLEDGMENT

This paper was made possible by the RETOS P-BreasTreat project (DPI2016-77415-R) of the Spanish Ministerio de Economía y Competitividad.

We want to thank all the physicians who have collaborated with our group over the years: Dr. Xavier Allué, Dr. Montserrat Olona, Dr. Antonio Pascual, Dr. Silvestre Martín, Dr. Albert Alonso, Dr. Patrizia Mecocci, Dr. María Bodí, Dr. Alejandro Rodríguez, and especially Dr. Antoni Collado, Dr. José Ramón Alonso, Dr. Fabio Campana, and Dr. Roberta Annicchiarico. We are also thankful to all the current and past CS members of the group: Dr. John A Bohada, Dr. Joan Albert López-Vallverdú, Dr. Aida Kamisalik, Dr. Francis Real, Ms. Susana Prado, Mr. Mohammed Sayed, Mr. Wilfrido Ortega, and our undergraduate and master students.

REFERENCES

[1] Riaño D, Prado S, "The analysis of hospital episodes," Second Int. Symposium on Medical Data Analysis (ISMDA 2001), In: Medical Data Analysis, LNCS 2199, Jose Crespo, Victor Maojo and Fernando Martin (Eds.), Madrid, Spain. 2001.

[2] Riaño D, Prado S, "The Study of Medical Costs with Intelligent Information Systems," 15th IEEE Symposium on Computer Based Medical Systems (CBMS 2002), Maribor, Slovenia, 2002.

[3] Riaño D, Prado S, Pascual A, Martín S, "A Multiagent System Model to Support Palliative Care Units," 15th IEEE Symposium on Computer Based Medical Systems (CBMS 2002), Maribor, Slovenia, 2002.

[4] Pascual A, Riaño D, Prado S, López C, Martín S, "Un Modelo de Sistema Multiagente en un Programa de Cuidados Paliativos," Congreso Nacional de Paliativos, Granada, 2002.

[5] Moreno A, Valls A, Riaño D, "Improving Palliative Care with Agent Technology," Workshop ECAI, 2004.

[6] Riaño D, Moreno A, Valls A, "Palliasys: Agent Based Palliative Care," IEEE 4th Conf. on Intelligent Systems Design and Applications (ISDA'04), ISBN 963 7154 29 9, Budapest, Hungary, 2004.

[7] Moreno A, Riaño D, Pascual A, Valls A, Mallafré X, "Sistema Telemático para la gestión de Unidades de Curas Paliativas," Informed 2004, Barcelona, 2004.

[8] Valls A, Moreno A, Riaño D, Pascual A, "Modelo de e-Asistencia Basado en las Tecnologías de SMAs y Análisis Inteligente de Datos," Informed 2004, Barcelona, 2004.

[9] Moreno A, Riaño D, Valls A, "Agent-based alarm management in a Palliative Care Unit," 3rd Workshop on Agents Applied in Health Care, IJCAI 2005, Edinburgh, 2005.

[10] Moreno A, Valls A, Riaño D, "PalliaSys: agent-based proactive monitoring of palliative patients," 4th Int Workshop on Practical Applications of Agents and Multi-Agent Systems, IWPAAMS-05, p. 101-110, León, 2005.

[11] Real F, Riaño D, (2008) Automatic Combination of Formal Intervention Plans Using SDA* Representation Model. In: Riaño D. (eds) Knowledge Management for Health Care Procedures. K4CARE 2007. Lecture Notes in Computer Science, vol 4924. Springer, Berlin.

[12] Riaño D, López-Vallverdú JA, Tu S, (2008) Mining Hospital Data to Learn SDA* Clinical Algorithms. In: Riaño D. (eds) Knowledge Management for Health Care Procedures. K4CARE 2007. Lecture Notes in Computer Science, vol 4924. Springer.

[13] Lozano E, Marcos M, Martínez-Salvador B, Alonso A, Alonso JR, (2010) Experiences in the Development of Electronic Care Plans for the Management of Comorbidities. In: Riaño D., ten Teije A., Miksch S., Peleg M. (eds) Knowledge Representation for Health-Care. Data, Processes and Guidelines. KR4HC 2009. LNCS 5943. Springer.

[14] Riaño D, Campana F, Annicchiarico R, Ercolani S, Federici A, Mecocci P, "K4CARE: a new intelligent system for home care," 6th Int. Conf. of the International Society for Gerontechnology, Pisa, Italy, 2008. Journal of the Int. Society for Gerontotechnology 7(2): 195, 2008.

[15] Spiru L, Turcu I, Ioancio I, Nuta C, Ghita C, Martin M, Annicchiarico R, Cortés U, Riaño D, "E-Health and Assistive Technology (AT) as suitable answers to global aging," Alzheimer's and Dementia: Journal of the Alzheimer's Association 5(4) supplement: 241, 2009.

[16] Riaño D, Real F, Campana F, Ercolani S, Annicchiarico R, "An Ontology for the Care of the Elder at Home," AIME'09, Jul 18-22, Verona, Italy, 2009.

[17] Riaño D, Real F, López-Vallverdú JA, Campana F, Ercolani S, Mecocci P, Annicchiarico R, Caltagirone C, "An ontology-based personalization of health-care knowledge to support clinical decisions for chronically ill patients," Int. Journal of Biomedical Informatics. 45(3): 429-446, 2012. Also published in the JBI Virtual Issue on Computer-interpretable Clinical Guidelines (Ed. Mor Peleg), 2013.

[18] López-Vallverdú JA, Riaño D, Bohada JA, "Improving medical decision trees by combining relevant health-care criteria," Expert Systems with Applications, 39(14): 11782-11791, 2012.

[19] Bohada JA, Riaño D, Lopez-Vallverdú JA, "Automatic Generation of Clinical Algorithms within the SDA Model," Expert Systems with Applications, 39(12): 10709-10721, 2012.

[20] Riaño D, Bohada JA, Collado A, López-Vallverdú JA, "MPM: A Knowledge-Based Model of Medical Practice," Int. Journal of Biomedical Informatics, 46(3): 379–387, 2013.

[21] Riaño D, "A Systematic Analysis of Medical Decisions. How to Store Knowledge and Experience in Decision Tables," LNAI 6924. pp 23-36, 2012.

[22] Real F, "Use of Decision Tables to Model Assistance Knowledge to Train Medical Residents," PhD Thesis Dissertation, URV, 2016.

[23] Real F, Riaño D, Alonso JR, "Training Residents in the Application of Clinical Guidelines for Differential Diagnosis of the most Frequent Causes of Arterial Hypertension with Decision Tables," 6th Int. Wshp. on Knowledge Representation for Health-Care, KR4HC 2014, Vienna, Austria.

[24] Riaño D, Real F, Alonso JR, "Improving resident's skills in the management of circulatory shock with a knowledge-based e-learning tool," Int J Med Inform. 2018 113:49-55.

[25] Real F, Riaño D, Alonso JR, "A Patient Simulation Model Based on Decision Tables for Emergency Shocks," 7th Int. Wshp. on Knowledge Representation for Health-Care, KR4HC 2015, Pavia, Italy.

[26] Riaño D, Ortega W, "Computer technologies to integrate medical treatments to manage multimorbidity," JBI 75 (2017).

[27] López-Vallverdú JA, Riaño D, Collado A, "Rule-Based Combination of Comorbid Treatments for Chronic Diseases Applied to Hypertension, Diabetes Mellitus, and Heart Failure," LNAI 7738, 30-41, 2013.

[28] Riaño D, Collado A. "Model-Based Combination of Treatments for the Management of Chronic Comorbid Patients," 14th Int. Conf. on Artificial Intelligence in Medicine, Murcia, Spain. In: Artificial Intelligence in Medicine. Springer LNAI 7885, 11-16.

[29] López-Vallverdú JA, Knowledge-Based Incremental Induction of Clinical Algorithms, PhD Thesis Dissertation, 2012.

[30] Riaño D, Fernández-Pérez A, "Simulation-Based Episodes of Care Data Synthetization for Chronic Disease Patients," Knowledge Representation for Health Care. ProHealth 2016, KR4HC 2016. LNCS 10096. Springer.


Research Topics in Computer-Interpretable Guidelines

Mar Marcos, Begoña Martínez-Salvador

Dept. of Computer Engineering and Science

Universitat Jaume I

Castellón, Spain

{mar.marcos, begona.martinez}@uji.es

Cristina Campos, Reyes Grangel

Dept. of Computer Languages and Systems

Universitat Jaume I

Castellón, Spain

{camposc, grangel}@uji.es

Abstract—This paper describes some lines of research of the Knowledge Engineering group at Universitat Jaume I, which investigates the construction of knowledge-based systems with an emphasis on practical applications. In particular, the paper presents a selection of research lines in the context of Computer-Interpretable Guidelines, a topic in which the group has specialised over the last years.

Index Terms—Knowledge Engineering, Clinical Decision-Support Systems, Clinical Practice Guidelines, Computer-Interpretable Guidelines

I. INTRODUCTION: KNOWLEDGE ENGINEERING GROUP AT UNIVERSITAT JAUME I

This paper describes the main research activities of the Knowledge Engineering (KE) group at Universitat Jaume I. The group investigates the construction of knowledge-based systems, with a special emphasis on practical applications in the field of Medicine. The KE group was founded in 2003 as a result of the consolidation of research activities initiated in collaboration with groups from other universities and research centers at the European level, in the framework of the EU projects Protocure I and II [1], [2]. Among the medical applications, the KE group has long worked on decision-support systems developed from Clinical Practice Guidelines. Another topic the KE group has paid attention to is the quality assurance of knowledge-based systems in general, and of clinical decision-support systems in particular. The group is also interested in knowledge representation in different fields (e.g. Medicine, Enterprise) and using different approaches (e.g. rule-based, ontologies, business process models).

II. CONTEXT: COMPUTER-INTERPRETABLE GUIDELINES

In the field of Medicine, the KE group has specialised for some time now in Clinical Practice Guidelines (CPGs) and their counterpart in electronic format, the so-called Computer-Interpretable Guidelines (CIGs). According to the most recent definition, CPGs are “statements that include recommendations intended to optimize patient care that are informed by a systematic review of evidence and an assessment of the benefits and harms of alternative care options” [3]. Research has demonstrated that CPGs have the potential to improve the quality and outcomes of healthcare. To achieve these benefits, CPG recommendations should be made available to clinicians where and when they are needed [4]. Although this can be done using CPGs in text form, there is consensus that the most effective way is by converting them to a computer-interpretable format [5]. Thus, CIGs can be defined as electronic versions of CPG documents to be executed as part of decision-support systems.

This work has been partially funded by the Spanish Ministry of Economy, Industry and Competitiveness and the European Regional Development Fund (ERDF) Programme through grant TIN2014-53749-C2-1-R.

CPGs are developed using methods that incorporate principles of Evidence-Based Medicine and consensus recommendations made by panels of experts, in the context of a specific clinical condition. As for their appearance and structure, CPGs are more or less lengthy documents describing neatly and in detail the recommended diagnostic and therapeutic interventions. CPGs often include tables summarising the key recommendations as well as flowcharts structuring the interventions, for ease of reference. The information provided in summary tables usually includes the class of recommendation, the level of evidence and the supporting literature reference(s). As an illustration of CPG documents, see for example the 2012 ESC guidelines for the diagnosis and treatment of acute and chronic heart failure [6].

CIGs have been an active topic in the Artificial Intelligence (AI) in Medicine field for over 20 years. The emergence of CIGs was motivated by the interest in making CPG recommendations available to clinicians in an easier and more immediate way than CPG documents allow. In a review work, Peleg identifies a total of 8 research themes in the CIG area [4]:

1) Representation languages

2) Acquisition and specification methodologies

3) Integration with Electronic Health Record (EHR) systems and with workflow systems

4) Verification & validation

5) Execution engines and other CIG-related tools

6) Exception and error handling

7) Maintenance

8) Sharing and reuse

The KE group has been involved in projects related to several of the above themes, specifically: verification & validation, acquisition and specification, sharing and reuse, and interoperability with EHR systems. In addition, the group has wide experience in CIG modelling using the PROforma representation language [7]. The following sections briefly describe a selection of the lines of research in which the group has recently been working, and the ongoing projects.

III. RESEARCH LINES

A. Interoperability between CIG and EHR systems

1) Biomedical problem and its impact: To be effective, CIG systems must be integrated within clinical information systems. Interoperability with the EHR is the main obstacle hindering this integration. On the EHR side, the main problem is the heterogeneity of clinical data sources, which may differ in the data models, schemas, naming conventions and level of detail they use. On the CIG side, the issue is that the data used are often at a much higher level of abstraction than the clinical data in the EHR (the so-called “impedance mismatch”).

The potential impact of interoperability solutions providing access to EHR data from CIG systems can be huge for healthcare quality. Additionally, applications related to the secondary use of EHR data, e.g. in clinical research, can benefit from the same kind of solutions.

2) Research team(s): The KE group first worked in this line of research in collaboration with the Instituto de Aplicaciones de las Tecnologías de la Información y de las Comunicaciones Avanzadas (ITACA), Universitat Politècnica de València, and with clinical collaborators from the Oncological Institute at Hospital Provincial de Castellón [8], [9]. In a different case study, the same teams worked with the Departamento de Informática y Sistemas, Universidad de Murcia, and with clinical experts from the Fundación para la Formación e Investigación Sanitaria of Murcia [10].

3) Approach: The approach seeks to solve the interoperability problems between CIG and EHR systems using generic EHR architectures, specifically openEHR archetypes (standardised information models for specific clinical concepts). It can be summarised in the following steps (see also Figure 1):

• An appropriate integration archetype (or archetypes) must be designed for the data/concepts used in the CIG.

• It must be ensured that the CIG includes references to this archetype in the parts where interactions with the EHR are required.

• It must be ensured that the connection with the EHR through the integration archetype is feasible, which implies the definition of a series of mappings between the elements of the archetype and the data elements of the EHR.
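The three steps above can be illustrated with a small Python sketch. It is a toy under stated assumptions only: the archetype identifier, the element paths, the local EHR field names, the unit conventions and the eligibility thresholds are all hypothetical, and the code does not reproduce the openEHR reference model or the implementation reported in [8], [9].

```python
# Hypothetical integration archetype: identifier and element paths are made up.
ARCHETYPE_ID = "example-EHR-OBSERVATION.tumour_assessment.v1"

# One mapping table per EHR: archetype path -> (local field name, converter).
# Field names and unit conventions are assumptions for illustration.
HOSPITAL_A = {
    "/data/tumour_size_mm": ("tum_size_cm", lambda cm: cm * 10),
    "/data/node_status": ("n_stage", lambda code: code),
}
HOSPITAL_B = {
    "/data/tumour_size_mm": ("TumourSizeMM", lambda mm: mm),
    "/data/node_status": ("NodalStatus", lambda code: code.upper()),
}


def to_archetype_instance(ehr_record, mappings):
    """Build an archetype-shaped view of a raw EHR record using the mappings."""
    instance = {"archetype_id": ARCHETYPE_ID}
    for path, (field, convert) in mappings.items():
        if field in ehr_record:
            instance[path] = convert(ehr_record[field])
    return instance


def eligible(instance):
    """Toy CIG criterion written only against archetype paths (thresholds invented)."""
    return (instance.get("/data/tumour_size_mm", 0) > 20
            and instance.get("/data/node_status") == "N0")


record_a = {"tum_size_cm": 3, "n_stage": "N0"}
record_b = {"TumourSizeMM": 30, "NodalStatus": "n0"}
view_a = to_archetype_instance(record_a, HOSPITAL_A)
view_b = to_archetype_instance(record_b, HOSPITAL_B)
print(eligible(view_a), eligible(view_b))
```

The point of the design is that `eligible` never mentions a local field name: connecting a further EHR only requires a new mapping table, while the guideline logic stays untouched.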

B. Utilisation of workflow patterns in the acquisition of CIG procedural knowledge

1) Biomedical problem and its impact: CPGs are difficult to understand and formalise, because they are aimed at clinicians with specialised background knowledge. Moreover, CPGs are often composed of sets of procedures with logical gaps or contradictions. On the other hand, CIG representation languages provide a wide range of modelling constructs tailored to the different knowledge types found in CPGs (procedures, decision criteria, abstract concepts, etc.). This makes CIG languages poorly accessible and understandable for clinicians, in general. As a result of these factors, CIG knowledge acquisition is usually carried out by joint teams made up of clinical and IT experts. This line of work seeks to facilitate the initial phases of knowledge acquisition of CIGs, by providing procedural patterns in a notation that can be further refined into different target CIG languages.

Fig. 1. Linking a CIG system to different EHR systems (extracted from Fig. 4 by Marcos et al. [9]).

The use of patterns can reduce modelling time and enable stakeholders to communicate more precisely and less ambiguously. In addition, faster modelling can serve to translate CPG recommendations almost immediately into a CIG system that can be used at the point of care, which would have a positive impact on healthcare quality.

2) Research team(s): This work was carried out jointly by the KE group and the Institute of Software Technology & Interactive Systems, Vienna University of Technology [11].

3) Approach: The overall idea was to provide a series of procedural patterns in a notation (1) that is intuitive for clinicians and (2) that can be easily refined into different CIG languages. For the patterns, the so-called workflow control patterns were used; these are frequent task (and control) structures identified in the fields of workflow systems and business process modelling. BPMN was chosen as the implementation-independent notation for describing the patterns. The most important result was the analysis of the adequacy of workflow control patterns for the representation of CPG procedural knowledge, and the identification of additional patterns, using a sample of CPG texts [11].

C. Transformation algorithms for the acquisition of CIG procedural knowledge

1) Biomedical problem and its impact: With a motivation (and potential impact) similar to that of the previous work, this line of research seeks to provide support for the refinement of CPG procedures from an initial specification in BPMN to an implementation in a CIG representation language, possibly in several different ones, in a (semi)automatic way (see Figure 2).

Fig. 2. Supporting the refinement of clinical process models into CIG models.

2) Research team(s): This work was carried out by the KE group, in collaboration with the Research Group on Artificial Intelligence, Universitat Rovira i Virgili, for some of its parts [12], [13].

3) Approach: As mentioned above, BPMN was used as the notation for the specification of CIGs. As for the target CIG representation languages, PROforma [13] and SDA [12] were considered. In both cases the transformation algorithm follows a structure identification strategy, which focuses on certain structures of interest in the target language and identifies the source BPMN structures that correspond to them. The algorithm works with source BPMN models/graphs that may contain sub-graphs, which typically occur in the case of CPGs.
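To give a flavour of what a structure identification step might look like, the following Python sketch searches a BPMN-like graph for one simple structure: an exclusive split whose branches are linear chains of tasks that all meet at the same exclusive join. This is an illustrative assumption, not the algorithm of [12], [13]; the graph encoding, the node kinds (`"xor_split"`, `"xor_join"`, `"task"`) and the function name are hypothetical.

```python
def find_choice_blocks(succ, kind):
    """Return (split, join, branches) for each XOR split whose branches are
    linear task chains that all end in one common XOR join.

    succ: dict mapping node id -> list of successor node ids
    kind: dict mapping node id -> node kind (hypothetical labels)
    """
    blocks = []
    for node, node_kind in kind.items():
        if node_kind != "xor_split":
            continue
        branches, joins = [], set()
        for start in succ.get(node, []):
            path, cur = [], start
            # Follow the branch while it is a plain chain of tasks
            # (the membership check guards against cycles).
            while (kind.get(cur) == "task"
                   and len(succ.get(cur, [])) == 1
                   and cur not in path):
                path.append(cur)
                cur = succ[cur][0]
            branches.append(path)
            joins.add(cur if kind.get(cur) == "xor_join" else None)
        # A block is recognised only if every branch reached the same join.
        if len(joins) == 1 and None not in joins:
            blocks.append((node, joins.pop(), branches))
    return blocks


succ = {"start": ["g1"], "g1": ["A", "B"], "A": ["A2"], "A2": ["j1"],
        "B": ["j1"], "j1": ["end"]}
kind = {"start": "event", "g1": "xor_split", "A": "task", "A2": "task",
        "B": "task", "j1": "xor_join", "end": "event"}
print(find_choice_blocks(succ, kind))
```

A block recognised in this way could then be mapped as a unit onto a corresponding construct of the target language; handling nested sub-graphs, as the text mentions, would require applying the identification recursively, which this sketch does not attempt.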

IV. ONGOING WORK

Current research activities of the KE group are mainly dedicated to the CLIN-IK-LINKS project, “Clinical Information and Knowledge Models for Linking Electronic Health Record and Clinical Decision Support Systems”, funded by the Spanish Ministry of Economy and Competitiveness (reference TIN2014-53749-C2-1-R). The aim of this project is to enable interoperability of EHR and clinical decision-support (CDS) systems in an effective and efficient way (research theme #3 in Peleg’s review [4]). For this purpose, we investigate the combination and joint exploitation of the most advanced technologies for information, domain, and inference models. The project is carried out in collaboration with the Departamento de Informática y Sistemas, Universidad de Murcia, and with the company VeraTech for Health SL. Within this project, we are developing a platform for the configuration and execution of web services for clinical data transformation and reasoning processes [14]. The project also includes tasks addressing the execution of interoperable CIGs (combination of research themes #3 and #5 in Peleg’s list).

Other ongoing research activities are related to CIG knowledge acquisition, and to CIG verification & validation (research themes #2 and #4 in Peleg’s list). As a continuation of the work described in section III-C, we are currently working on model transformation methods for CIGs using Model-Driven Engineering (MDE) tools. In connection with the verification & validation of CIGs, we are working on structural metrics that can be used to assess the quality of CIG model design (e.g. to detect certain maintainability issues), in line with what is done in the Software Engineering field.

V. CONCLUSIONS AND OUTLOOK

The field of Medicine in general and the CPG/CIG area in particular present challenges and opportunities requiring new methods and techniques that can be addressed from an AI perspective. To take one example, the acquisition of knowledge of CPGs and the development of CIGs is usually tackled once the final CPG texts have been published. In this context, we are interested in the design of environments for knowledge management of CPGs, following the directives of medical organisations and taking into account the requirements for the further development of different types of support tools, CIGs or others.

REFERENCES

[1] A. ten Teije, M. Marcos, M. Balser, J. Van Croonenborg, C. Duelli, F. van Harmelen, P. Lucas, S. Miksch, W. Reif, K. Rosenbrand, and A. Seyfang. Improving medical protocols by formal methods. Artif Intell Med, 36(3):193–209, 2006.

[2] M. Balser, O. Coltell, J. Van Croonenborg, C. Duelli, F. Van Harmelen, A. Jovell, P. Lucas, M. Marcos, S. Miksch, W. Reif, K. Rosenbrand, A. Seyfang, and A. Ten Teije. Protocure: Supporting the development of medical protocols through formal methods. volume 101, pages 103–107, 2004.

[3] Institute of Medicine. Clinical Practice Guidelines We Can Trust. The National Academies Press, Washington, DC, 2011.

[4] M Peleg. Computer-interpretable clinical guidelines: A methodological review. J Biomed Inform, 46(4):744–763, 2013.

[5] F A Sonnenberg and C G Hagerty. Computer-interpretable clinical practice guidelines. Where are we and where are we going? Yearb Med Inform, pages 145–158, 2006.

[6] The Task Force for the Diagnosis and Treatment of Acute and Chronic Heart Failure 2012 of the European Society of Cardiology. ESC Guidelines for the diagnosis and treatment of acute and chronic heart failure 2012. Eur Heart J, 33(14):1787–1847, 2012.

[7] J Fox, N Johns, and A Rahmanzadeh. Disseminating medical knowledge: The PROforma approach. Artif Intell Med, 14(1-2):157–181, 1998.

[8] M Marcos, J A Maldonado, B Martínez-Salvador, D Moner, D Boscá, and M Robles. An archetype-based solution for the interoperability of computerised guidelines and electronic health records. In Proc. of the 13th Conference on Artificial Intelligence in Medicine, AIME 2011, pages 276–285, 2011.

[9] M Marcos, J A Maldonado, B Martínez-Salvador, D Boscá, and M Robles. Interoperability of clinical decision-support systems and electronic health records using archetypes: A case study in clinical trial eligibility. J Biomed Inform, 46(4):676–689, 2013.

[10] J T Fernández-Breis, J A Maldonado, M Marcos, M C Legaz-García, D Moner, J Torres-Sospedra, A Esteban-Gil, B Martínez-Salvador, and M Robles. Leveraging electronic healthcare record standards and semantic web technologies for the identification of patient cohorts. J Am Med Inform Assoc, 20(e2):e288–e296, 2013.

[11] K Kaiser and M Marcos. Leveraging workflow control patterns in the domain of clinical practice guidelines. BMC Med Inform Decis Mak, 16(20), 2016.

[12] B Martínez-Salvador, M Marcos, and D Riaño. An Algorithm for Guideline Transformation: From BPMN to SDA. Procedia Computer Science, 63:244–251, 2015. 5th International Conference on Current and Future Trends of Information and Communication Technologies in Healthcare (ICTH-2015).


[13] B Martínez-Salvador and M Marcos. Supporting the Refinement of Clinical Process Models to Computer-Interpretable Guideline Models. Bus Inform Syst Eng+, 58(5):355–366, Oct 2016.

[14] J A Maldonado, M Marcos, J T Fernández-Breis, E Parcero, D Boscá, M C Legaz-García, B Martínez-Salvador, and M Robles. A platform for exploration into chaining of web services for clinical data transformation and reasoning. AMIA Annu Symp Proc, 2016:854–863, 2016.


Research in Artificial Intelligence in Medicine by the AIKE Group at the Universidad de Murcia

Manuel Campos, Bernardo Cánovas-Segura, María A. Cárdenas, Félix Gómez de León, Fernando Jiménez, José M. Juárez, Roque Marín, Antonio Morales, José T. Palma

Facultad de Informática

Universidad de Murcia

{manuelcampos|mariancv|bernardocs|gdleon|fernan|jmjuarez|roquemm|morales|jtpalma}@um.es

Francisco Palacios

Unidad de Cuidados Intensivos

Hospital Universitario de Getafe

Madrid, Spain


Abstract—In this paper we present the activity of the Artificial Intelligence and Knowledge Engineering (AIKE) research group in the field of Artificial Intelligence in Medicine. Specifically, we summarise its main research lines: temporal reasoning, case-based reasoning, spatio-temporal logic, knowledge representation and evolutionary computation. Finally, we describe the group's most recent projects in the health domain.

Index Terms—Artificial Intelligence in Medicine; Temporal Reasoning; Temporal Logic; Knowledge Representation; Evolutionary Computation.

I. INTRODUCTION

The Artificial Intelligence and Knowledge Engineering group (AIKE)¹ of the Universidad de Murcia, founded in 1993 by Roque Marín, is a multidisciplinary group currently made up of 25 members from the fields of computer science, engineering, physics and medicine. Its research focuses on the application of Artificial Intelligence (AI), mainly in the health domain and also in industrial diagnosis and monitoring.

This paper focuses on the contribution of the AIKE group to Artificial Intelligence in Medicine. It describes:

• The group's research lines in Artificial Intelligence and their main contributions (Sec. II).

• The most recent projects carried out in Artificial Intelligence applied to health (Sec. III).

• The lessons learned and the challenges to be addressed in future work (Sec. IV).

II. RESEARCH LINES

Below we describe the main research lines and results of the AIKE group over the years, including those related to the DAISY and WASPSS projects, which are described in Sec. III.

¹ http://www.um.es/aike/

II-A. Fuzzy Temporal Reasoning

The oldest research line of the AIKE group focuses on methods for representing and solving constraint problems in which time plays a fundamental role.

One of the main results of this line is the Fuzzy Temporal Constraint Network (FTCN) [1], [2]. An FTCN consists of a set of temporal variables and a finite set of binary temporal constraints defined over those variables. An FTCN can be represented as a directed graph in which the nodes represent the temporal variables and the arcs represent the binary temporal constraints. On this basis, inference techniques have been developed to derive previously unknown relations and to propagate constraints until the minimal network is obtained. These theoretical results have been implemented in different ways to provide solutions in several contexts within the medical domain.
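As a rough illustration of constraint propagation towards a minimal network, the Python sketch below treats only the crisp special case, in which each binary constraint is an interval [lo, hi] on the distance between two time points rather than a fuzzy possibility distribution as in an FTCN. It uses the standard all-pairs shortest-path (Floyd-Warshall) formulation for simple temporal networks; the function name, the encoding and the example values are illustrative assumptions and are not taken from the group's software.

```python
import math


def minimal_network(n, constraints):
    """Tighten a crisp temporal constraint network over time points 0..n-1.

    constraints: {(i, j): (lo, hi)} meaning lo <= t_j - t_i <= hi.
    Returns the tightest implied (lo, hi) for every pair i < j,
    or None if the constraints are inconsistent.
    """
    # d[i][j] is the current upper bound on t_j - t_i.
    d = [[0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for (i, j), (lo, hi) in constraints.items():
        d[i][j] = min(d[i][j], hi)    # t_j - t_i <= hi
        d[j][i] = min(d[j][i], -lo)   # t_i - t_j <= -lo
    # Propagate: a bound through k may be tighter than the direct bound.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    # A negative cycle means no assignment of times satisfies everything.
    if any(d[i][i] < 0 for i in range(n)):
        return None
    return {(i, j): (-d[j][i], d[i][j])
            for i in range(n) for j in range(n) if i < j}


# Hypothetical example (hours): 0 = symptom onset, 1 = admission, 2 = fever peak.
net = {(0, 1): (1, 3), (1, 2): (2, 4), (0, 2): (0, 4)}
print(minimal_network(3, net))
```

In this example the direct constraint between points 0 and 2 is tightened by the path through point 1, and in turn tightens the other two constraints, which is the kind of previously implicit relation that propagation makes explicit.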

We also highlight the FuzzyTIME software, a general-purpose temporal reasoner that supports fuzzy temporal constraints between time points and intervals. FuzzyTIME provides tools for maintaining and querying the temporal information of FTCNs by means of statements in a high-level language. For example, [3] demonstrates its usefulness in the avian influenza scenario.

High-level descriptions of how data evolve are essential in the medical domain, since they yield information that is easier to compare and more compact. To this end, temporal abstraction techniques based on FTCNs have been developed [4], [5].


II-B. Spatial and Temporal Logics

A second basic research line of the AIKE group is the development of formal models for knowledge representation and reasoning by means of temporal and spatial logics.

Among the most recent results we highlight [6], which proposes a first-order temporal logic, FTCLogic, capable of handling fuzzy temporal constraints between temporal variables. It is based on Possibilistic Logic and integrates the FTCN model into a domain-independent language. This also makes it suitable for practical implementation in domains such as medicine.

From an applied point of view, [7] describes the steps to follow in order to use a temporal logic for representing clinical guidelines.

Several works address the problem of extending temporal logics in order to obtain new spatial logics for reasoning about cardinal directions between regions approximated by rectangles. On the one hand, [8] uses modal interval logics to propose new modal logics of rectangles. On the other hand, [9] and [10] take an algebraic approach to the problem using qualitative constraint network techniques. These models have subsequently been incorporated into decision-support systems in clinical settings, as described in Sec. III.

II-C. Temporal Diagnosis

Another line of work of the group focuses on diagnosis, understood as the process of explaining the observed behaviour of a system from a set of observed events.

This line concentrates on temporal diagnosis in medicine, modelling abnormal behaviour by representing the causal and temporal relations between abnormal findings and diseases. These relations are defined by means of Fuzzy Temporal Diagnostic Patterns. Each pattern includes (1) knowledge about the context, which influences the temporal evolution of a disease, (2) causal relations and (3) temporal relations forming an FTCN [11].

In addition to model representation, temporal diagnosis methods have been developed. In [12] this kind of process is described in two steps. First, a (temporally consistent) causal network is built from the set of observed events using an abductive strategy. Second, some hypotheses are removed from the causal network by means of conservative pruning.

These models have been implemented, leading to knowledge acquisition tools (such as CATEKAT), and applied to the diagnosis of acute myocardial infarction in the ICU [13].

II-D. Temporal Case-Based Reasoning

Case-Based Reasoning (CBR) is a methodology that aims to solve problems on the basis of previously solved problems. Since 2004 the AIKE group has been working on temporal CBR, in which cases include temporal information in the form of event sequences or constraint networks. In this line we transfer the temporal models described in Sec. II-A to CBR and then apply them in the medical domain. Among the basic research results we highlight the development of temporal similarity functions that allow cases to be compared and retrieved as sequences of events/points or intervals [14], [15]. We have addressed the problem of cases that originate from workflow executions and its application to clinical guidelines for stroke [16], [17]. Also in the medical field, the AIKE group has worked on a system for retrieving cases from the clinical records of an Intensive Care Unit [18].
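To make the idea of comparing and retrieving temporal cases concrete, here is a deliberately simple Python sketch. It is not one of the similarity functions of [14], [15]: the case encoding (one time stamp per event label), the linear decay over a fixed `horizon` and all names are assumptions chosen for illustration.

```python
def temporal_similarity(case_a, case_b, horizon=24.0):
    """Similarity in [0, 1] between two cases.

    Each case is a dict mapping an event label to its time (e.g. hours since
    admission). A shared event scores 1 - |dt| / horizon, floored at 0;
    an event present in only one case scores 0.
    """
    labels = set(case_a) | set(case_b)
    if not labels:
        return 1.0
    total = 0.0
    for label in labels:
        if label in case_a and label in case_b:
            dt = abs(case_a[label] - case_b[label])
            total += max(0.0, 1.0 - dt / horizon)
    return total / len(labels)


def retrieve(query, case_base, k=1):
    """Return the k most similar stored cases as (name, similarity), best first."""
    scored = [(name, temporal_similarity(query, case))
              for name, case in case_base.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


# Hypothetical cases: event label -> hours since admission.
query = {"fever": 0, "antibiotic": 6}
case_base = {
    "case_1": {"fever": 0, "antibiotic": 18},
    "case_2": {"fever": 0},
}
print(retrieve(query, case_base, k=2))
```

Richer encodings, such as intervals or constraint networks between events, would need a correspondingly richer comparison; the sketch only shows where a similarity function sits in the retrieval step.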

We are currently working on the problem of temporal case reduction (analogous to instance removal) [19], [20].

II-E. Temporal Data Mining

Temporal data mining consists of extracting patterns from data whose importance or significance is relevant according to some measure. In this AIKE research line we have worked on the extraction of temporally consistent fuzzy constraint networks that include points and intervals [21].

In order to make the algorithms more efficient, the shape of the network has been restricted so as to obtain sequential patterns with relations between points and intervals, in collaboration with Antwerpen University [22]. Antonio Gomariz's thesis includes a pattern representation that supports efficient temporal reasoning.

In later work we have used the same multivariate sequential patterns to describe the evolution of patients in a burns unit, with the aim of predicting an adverse course leading to the patient's death [23]. Several time-series discretisation techniques and several associative classifiers have been evaluated in order to exploit the predictive capacity of the patterns to the full.

II-F. Multi-objective Evolutionary Machine Learning

Multi-objective evolutionary computation is currently an area of great interest in the scientific community and is being successfully applied in most computational processes that involve multiple objectives, notably those related to machine learning.

[Figure 1: block diagram. The WASPSS Decision Support System comprises a Drools Rule Engine, an Ontological Reasoner, a Workflow Engine, Datamining Algorithms, an Operational Database, a Knowledge Base, ETL Processes, an HL7 Interface and a Web-based Java Platform. The diagram also shows the ASP Group and the Hospital Information System.]

Figure 1. Architecture of WASPSS [29].

The group has developed and applied multi-objective evolutionary computation in supervised and unsupervised learning settings. Specifically, multi-objective evolutionary algorithms have been developed and successfully applied to attribute selection, instance selection, classification and regression. In [24], the ENORA algorithm developed by the group was applied to attribute selection for the classification of data from multi-skill contact centers. In [25], ENORA was applied to attribute selection in regression tasks for online sales forecasting. In [26], [27], ENORA was used for the fuzzy classification of mortality by infection in severely burned patients. As for unsupervised learning, ENORA was applied in [28], first, to attribute selection on unclassified data about children's behavior in conjunction with the EM clustering technique and, afterwards, to the fuzzy classification of their behavioral assessment.
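The core notion behind multi-objective attribute selection is Pareto dominance: a candidate attribute subset is kept only if no other candidate is at least as good on every objective and strictly better on one. The sketch below filters a set of candidates scored by two minimized objectives. It is not an implementation of ENORA; the objective pair (classification error, number of attributes) and the scores are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Each candidate is scored as (classification error, number of selected attributes);
# both objectives are minimized. The scores below are made up for the example.
Objectives = Tuple[float, int]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is no worse than `b` on every objective and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates: List[Objectives]) -> List[Objectives]:
    """Return the candidates that are not dominated by any other candidate."""
    return [c for i, c in enumerate(candidates)
            if not any(dominates(o, c) for j, o in enumerate(candidates) if j != i)]


candidates = [(0.10, 8), (0.12, 5), (0.20, 3), (0.15, 6), (0.10, 9)]
front = pareto_front(candidates)
print(front)
```

An evolutionary algorithm would repeatedly generate new candidates and apply a selection step based on this dominance relation; the final front lets the analyst trade accuracy against model size.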

III. RECENT COMPLETED PROJECTS

III-A. The WASPSS Project

WASPSS² (TIN2013-45491-R) is a project funded by MINECO in collaboration with clinicians from the Hospital Universitario de Getafe who belong to the ICU, microbiology and pharmacy specialties. Eighteen researchers took part in this project (14 AI specialists and 4 clinicians).

The WASPSS project focuses on the problem of the emergence of multidrug resistance. This problem has led international bodies and those responsible for health services to define policies that preserve the effectiveness of antibiotics by mitigating the development of resistance caused by inappropriate use. In particular, public bodies have established programs for the rational use of antibiotics, known as Antimicrobial Stewardship Programs (ASP). The ASP team is an interdisciplinary group of hospital clinicians who define the actions to be carried out across all dimensions of the problem.

The challenge addressed by WASPSS is the design of an intelligent platform that helps manage an ASP in a hospital, handling the clinical processes involved in an integrated way. Specifically, techniques and tools have been developed for: a business intelligence model for ASP, the study of time series, decision support for antibiotic treatment, and support for the application of clinical guidelines.

Developing an intelligent system of this kind requires data and knowledge sources from different areas of the hospital. In the project we have studied the impact of ASP teams in hospitals [30]–[32] and proposed an architecture that integrates information and knowledge and makes them available to the remaining subsystems [33]–[35].

² http://www.um.es/waspss/

Figure 2. Architecture of the DAISY system [41].

Decision making in antibiotic therapy is complex and depends on the antibiogram, the patient's clinical history, and clinical protocols and guidelines. In this project we have developed several proposals for prescription alerts [29], [36] and antibiogram visualization techniques [37].
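As a rough illustration of how local susceptibility data can feed a prescription alert, the sketch below aggregates isolate results into a cumulative antibiogram (percentage of susceptible isolates per organism and antibiotic) and flags pairs that fall below a threshold. This is not the method of [29] or [36]: the record layout, the 80% threshold and the function names are assumptions, and real reports apply further rules (for example, de-duplicating isolates per patient) that are omitted here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Isolate = Tuple[str, str, str]  # (organism, antibiotic, result), result in {"S", "I", "R"}


def cumulative_antibiogram(isolates: Iterable[Isolate]) -> Dict[Tuple[str, str], float]:
    """Percentage of susceptible ("S") isolates for each (organism, antibiotic) pair."""
    tested: Dict[Tuple[str, str], int] = defaultdict(int)
    susceptible: Dict[Tuple[str, str], int] = defaultdict(int)
    for organism, antibiotic, result in isolates:
        key = (organism, antibiotic)
        tested[key] += 1
        if result == "S":
            susceptible[key] += 1
    return {key: 100.0 * susceptible[key] / tested[key] for key in tested}


def needs_alert(table: Dict[Tuple[str, str], float], organism: str,
                antibiotic: str, threshold: float = 80.0) -> bool:
    """True if local data exist and susceptibility is below the (assumed) threshold."""
    rate = table.get((organism, antibiotic))
    return rate is not None and rate < threshold


# Hypothetical laboratory results.
isolates = [
    ("E. coli", "ciprofloxacin", "S"),
    ("E. coli", "ciprofloxacin", "R"),
    ("E. coli", "ciprofloxacin", "S"),
    ("E. coli", "ciprofloxacin", "S"),
    ("E. coli", "meropenem", "S"),
    ("E. coli", "meropenem", "S"),
    ("K. pneumoniae", "ciprofloxacin", "R"),
    ("K. pneumoniae", "ciprofloxacin", "I"),
]
table = cumulative_antibiogram(isolates)
print(table)
print(needs_alert(table, "E. coli", "ciprofloxacin"))
```

In a deployed system, such a check would be only one input among several, alongside the patient's history and the applicable guideline.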

Clinical guidelines for antibiotic treatment are essential tools for the physician; however, antibiotic prescription and dosage depend in part on the local epidemiology. In the WASPSS project we have focused both on the computational representation of international reference clinical guidelines and on their execution at the local level. To this end, we have worked with industry-standard models such as BPMN [38], [39], production rules and Semantic Web models [40].

III-B. The DAISY Project

DAISY³ (15277/PI/10), funded by the Fundación Séneca of the Region of Murcia, investigates adaptive and personalized alarm systems in a home monitoring environment aimed at two target populations: elderly people at risk who live alone, and elderly people with Alzheimer's disease and/or frontotemporal dementia. The project had the collaboration of members of the Dementia Unit of the Hospital Universitario Virgen de la Arrixaca, with a total of 9 researchers (7 AI specialists and 2 clinicians).

The main objective of the project was to analyze the activity of elderly people at home (logs) and to raise alarms according to their behavior. The scientific advances related to the project lie in the development of temporal CBR techniques and the assessment of potential dementia. Fig. 2 describes the architecture of the proposed system. In problems of early alert generation in critical domains, the temporal component plays a fundamental role. Thus, compared with the a priori description of risk situations (alert rules), CBR techniques provide, in an integrated way, effective mechanisms for detecting changes in an individual's behavior in a personalized manner, and they make the system simple to adapt and maintain. A first problem to be solved was the technological development of temporal CBR tools, in collaboration with the West London University [42], in which the similarity metrics described in Section II-D were implemented. A second challenge, due to the large volume of records, was the development of case reduction techniques (case-base maintenance) [20], [43]. To this end, CELSEA⁴, an API with a scientific purpose for the study of case-base maintenance, was developed [44]. Part of these results were obtained in collaboration with researchers from the Robert Gordon University [45].

³ http://perseo.inf.um.es/~daisycbr/
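To give a concrete sense of what case reduction does, the following sketch applies Hart's condensed nearest neighbour rule, a classic instance-selection method: it keeps only the cases needed for a 1-nearest-neighbour classifier to label every original case correctly. It is shown purely as an illustration and is neither one of the temporal case-base maintenance algorithms of [20], [43] nor the CELSEA API; the numeric feature vectors and labels are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Case = Tuple[Tuple[float, ...], str]  # (feature vector, solution label)


def nearest_label(store: Sequence[Case], point: Sequence[float]) -> str:
    """Label of the stored case closest to `point` (Euclidean distance)."""
    return min(store, key=lambda case: math.dist(case[0], point))[1]


def condense(cases: Sequence[Case]) -> List[Case]:
    """Hart's condensed nearest neighbour: add a case only when the store misclassifies it."""
    if not cases:
        return []
    store: List[Case] = [cases[0]]
    changed = True
    while changed:
        changed = False
        for case in cases:
            if case in store:
                continue  # each case is added at most once, so the loop terminates
            features, label = case
            if nearest_label(store, features) != label:
                store.append(case)
                changed = True
    return store


# Hypothetical case base: one numeric feature (e.g. hours of night-time activity).
cases = [
    ((0.0,), "normal"), ((0.1,), "normal"), ((0.2,), "normal"),
    ((5.0,), "risk"), ((5.1,), "risk"),
]
reduced = condense(cases)
print(reduced)
```

Here the five original cases are reduced to two while preserving the retrieval outcome on the original data; temporal cases would require a sequence-aware distance in place of `math.dist`.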

A second aspect of this project has been the study of suspected cognitive impairment, in particular with regard to Alzheimer's disease and frontotemporal dementia, in collaboration with neuropsychologists. The problem has been approached from two different fronts. On the one hand, ubiquitously, through the analysis of daily activity with the aim of detecting falls and aberrant behaviors that might be signs of dementia. To this end, visual mining techniques have been developed to detect this type of behavior visually [46]. On the other hand, software has been developed for ambulatory neuropsychological assessment, which preprocesses the information and records response times [47], [48].

IV. CONCLUSIONS

Over the last 25 years, the AIKE research group has worked in the field of Artificial Intelligence to propose solutions in the health domain, both in theoretical models and in more applied developments.

As the backbone of our research, we consider that the temporal dimension plays a fundamental role in solving medical tasks. For this reason, the AIKE group has focused its research on the study of formal models of temporal representation and reasoning from different points of view (logics and algebras, time series and sequences, fuzzy management of time, etc.).

The second line of work of the AIKE group is the development of AI in hospitals to support physicians during clinical practice. For the tools developed to be adopted in the clinician's workflow, we believe the following are essential: (a) a deep understanding of the clinical problem (acquired over years of close collaboration between physicians and researchers); (b) the development of software in production in hospitals that adds value in day-to-day work and integrates the key information for its exploitation; (c) the use of AI models and techniques that clinicians can interpret and validate.

Finally, among the lines of work and collaboration planned for the coming years, we highlight:

• Formal knowledge representation for the definition of clinical strategies and protocols.

• Proposal and development of clinical parameter selection through a multi-objective optimization approach.

• Development of phenotyping techniques based on subgroup and partitioning algorithms.

• Study of visualization techniques to improve the medical interpretability of learning techniques.

• Starting a line of work on temporal prediction techniques applied to treatments and consumption.

⁴ http://perseo.inf.um.es/~aike/celsea/

ACKNOWLEDGMENTS

This work has been partially funded by the Spanish Ministry of Economy and Competitiveness (Ministerio de Economía y Competitividad) and FEDER funds through the WASPSS project (Ref: TIN2013-45491-R).

REFERENCES

[1] R. Marín, S. B. A. Bosch, and J. Mira, “Modeling time representation from a fuzzy perspective,” Cybernetics and Systems, vol. 25, no. 2, pp. 207–215, 1994.

[2] R. Marín, M. A. Cárdenas, M. Balsa, and J. L. Sánchez, “Obtaining solutions in fuzzy constraint networks,” International Journal of Approximate Reasoning, vol. 3-4, pp. 261–288, 1996.

[3] M. C. Martínez, J. M. Juárez, J. T. Palma, R. Marín, and F. Palacios, “Avian influenza: Temporal modeling of a human to human transmission case,” Expert Syst. Appl., vol. 38, no. 7, pp. 8865–8885, 2011.

[4] M. Campos, J. M. Juárez, J. Salort, J. T. Palma, and R. Marín, “Reasoning in dynamic systems: From raw data to temporal abstract information,” Neurocomputing, vol. 72, no. 4-6, pp. 871–878, 2009.

[5] M. C. Martínez, J. M. Juárez, J. T. Palma, and R. Marín, “Using temporal constraints for temporal abstraction,” J. Intell. Inf. Syst., vol. 34, no. 1, pp. 57–92, 2010.

[6] M. Cárdenas-Viedma and R. Marín, “FTCLogic: Fuzzy temporal constraint logic,” Fuzzy Sets and Systems, (accepted), 2018.

[7] G. Sciavicco, J. M. Juárez, and M. Campos, “Quality checking of medical guidelines using interval temporal logics: A case-study,” in IWINAC (2), ser. Lecture Notes in Computer Science, vol. 5602. Springer, 2009, pp. 158–167.

[8] A. Morales, I. Navarrete, and G. Sciavicco, “A new modal logic for reasoning about space: spatial propositional neighborhood logic,” Ann. Math. Artif. Intell., vol. 51, no. 1, pp. 1–25, 2007.

[9] I. Navarrete, A. Morales, and G. Sciavicco, “Consistency checking of basic cardinal constraints over connected regions,” in Proceedings of IJCAI, 2007, pp. 495–500.

[10] I. Navarrete, A. Morales, G. Sciavicco, and M. A. C. Viedma, “Spatial reasoning with rectangular cardinal relations - the convex tractable subalgebra,” Ann. Math. Artif. Intell., vol. 67, no. 1, pp. 31–70, 2013.

[11] J. T. Palma, J. M. Juárez, M. Campos, and R. Marín, “A fuzzy approach to temporal model-based diagnosis for intensive care units,” in ECAI. IOS Press, 2004, pp. 868–872.

[12] J. T. Palma, J. M. Juárez, M. Campos, and R. Marín, “Fuzzy theory approach for temporal model-based diagnosis: An application to medical domains,” Artificial Intelligence in Medicine, vol. 38, no. 2, pp. 197–218, 2006.

[13] J. M. Juárez, M. Campos, J. T. Palma, and R. Marín, “Computing context-dependent temporal diagnosis in complex domains,” Expert Syst. Appl., vol. 35, no. 3, pp. 991–1010, 2008.

[14] J. M. Juárez, F. Guil, J. T. Palma, and R. Marín, “Temporal similarity by measuring possibilistic uncertainty in CBR,” Fuzzy Sets and Systems, vol. 160, no. 2, pp. 214–230, 2009.

[15] Z. Huang, J. M. Juárez, W. Dong, L. Ji, and H. Duan, “Predictive monitoring of local anomalies in clinical treatment processes,” in AIME, ser. Lecture Notes in Computer Science, vol. 9105. Springer, 2015, pp. 25–34.

[16] C. Combi, M. Gozzi, J. M. Juárez, R. Marín, and B. Oliboni, “Querying clinical workflows by temporal similarity,” in AIME, ser. Lecture Notes in Computer Science, vol. 4594. Springer, 2007, pp. 469–478.

[17] C. Combi, M. Gozzi, B. Oliboni, J. M. Juárez, and R. Marín, “Temporal similarity measures for querying clinical workflows,” Artificial Intelligence in Medicine, vol. 46, no. 1, pp. 37–54, 2009.

[18] J. M. Juárez, J. Salort, J. T. Palma, and R. Marín, “Case representation ontology for case retrieval systems in medical domains,” in Artificial Intelligence and Applications. IASTED/ACTA Press, 2007, pp. 188–193.


[19] J. M. Juárez, S. Craw, J. R. López, and M. Campos, “Maintenance of case bases: Current algorithms after fifty years,” in Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, ser. IJCAI’18, 2018, pp. 5458–5463.

[20] E. Lupiani, J. M. Juárez, and J. T. Palma, “Evaluating case-base maintenance algorithms,” Knowl.-Based Syst., vol. 67, pp. 180–194, 2014.

[21] M. Campos, J. T. Palma, and R. Marín, “Temporal data mining with temporal constraints,” in AIME, ser. Lecture Notes in Computer Science, vol. 4594. Springer, 2007, pp. 67–76.

[22] P. Fournier-Viger, A. Gomariz, M. Campos, and R. Thomas, “Fast vertical mining of sequential patterns using co-occurrence information,” in PAKDD (1), ser. Lecture Notes in Computer Science, vol. 8443. Springer, 2014, pp. 40–52.

[23] I. J. Casanova, M. Campos, J. M. Juárez, A. Fernández-Fernández-Arroyo, and J. A. Lorente, “Using multivariate sequential patterns to improve survival prediction in intensive care burn unit,” in AIME, ser. Lecture Notes in Computer Science, vol. 9105. Springer, 2015, pp. 277–286.

[24] F. Jiménez, E. Marzano, G. Sánchez, G. Sciavicco, and N. Vitacolonna, “Attribute selection via multi-objective evolutionary computation applied to multi-skill contact center data classification,” in SSCI. IEEE, 2015, pp. 488–495.

[25] F. Jiménez, G. Sánchez, J. M. García, G. Sciavicco, and L. M. Pechuán, “Multi-objective evolutionary feature selection for online sales forecasting,” Neurocomputing, vol. 234, pp. 75–92, 2017.

[26] F. Jiménez, G. Sánchez, and J. M. Juárez, “Multi-objective evolutionary algorithms for fuzzy classification in survival prediction,” Artificial Intelligence in Medicine, vol. 60, no. 3, pp. 197–219, 2014. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0933365713001668

[27] F. Jiménez, G. Sánchez, J. M. Juárez, J. M. Alcaraz, and J. F. Sánchez, “Fuzzy classification of mortality by infection of severe burnt patients using multiobjective evolutionary algorithms,” in IWINAC (1), ser. Lecture Notes in Computer Science, vol. 5601. Springer, 2009, pp. 447–456.

[28] F. Jiménez, R. Jódar, M. del Pilar Martín, G. Sánchez, and G. Sciavicco, “Unsupervised feature selection for interpretable classification in behavioral assessment of children,” Expert Systems, vol. 34, no. 4, 2017.

[29] B. Cánovas-Segura, M. Campos, A. M. Nicolás, J. M. Juárez, and F. Palacios, “Development of a clinical decision support system for antibiotic management in a hospital environment,” Progress in Artificial Intelligence, vol. 5, no. 3, pp. 181–197, 2016.

[30] A. M. Vázquez, J. G. Septiem, I. J. T. Vasallo, D. Sanz-Rosa, M. C. Martínez, M. E. Martínez, F. P. Ortega, and J. M. J. Navalón, “Programa de optimización de antibióticos en un servicio de cirugía general y digestiva: efecto sobre prescripción de meropenem en sus dos primeros de implantación,” Revista de Cirugía Española, vol. 94, no. Noviembre 2016, p. 135, November 2016.

[31] J. G. Septiem, M. E. Martínez, A. M. Vázquez, F. Palacios, A. S. Grande, C. B. Recuenco, and J. M. J. Navalón, “Impacto de implantación del programa PROA (programa de racionalización y optimización del uso de antimicrobianos) en un servicio de cirugía general y digestivo,” Revista de Cirugía Española, vol. 94, no. Noviembre 2016, p. 126, November 2016.

[32] A. Manuel-Vázquez, F. Palacios-Ortega, J. García-Septiem, I. J. Thuissard, D. Sanz-Rosa, J. Arias-Díaz, J. M. Jover-Navalón, and J. M. Ramia, “Antimicrobial stewardship program in a department of surgery: human, electronic and methodological resources. Results after three years,” Annals of Surgery, (under evaluation), 2018.

[33] F. Palacios, M. Campos, J. M. Juárez, S. E. Cosgrove, E. Avdic, B. Cánovas-Segura, A. Morales, M. E. Martínez-Núñez, T. Molina-García, P. García-Hierro, and J. Cacho-Calvo, “A clinical decision support system for an Antimicrobial Stewardship Program,” in HEALTHINF 2016 - 9th International Conference on Health Informatics, Proceedings. Rome: SciTePress, 2016, pp. 496–501.

[34] M. Campos, B. García, J. M. Juárez, J. M. Guillamón, and F. Palacios, “What do doctors need for effective adoption and integration of clinical guidelines into daily practice?” in ICHI. IEEE Computer Society, 2014, pp. 247–255.

[35] A. Morales, B. Cánovas-Segura, M. Campos, J. M. Juárez, and F. Palacios, “Proposal of a Big Data Platform for Intelligent Antibiotic Surveillance in a Hospital,” in Advances in Artificial Intelligence: 17th Conference of the Spanish Association for Artificial Intelligence, CAEPIA 2016, Salamanca, Spain, September 14-16, 2016. Proceedings, O. Luaces, J. A. Gámez, E. Barrenechea, A. Troncoso, M. Galar, H. Quintián, and E. Corchado, Eds., 2016, pp. 261–270.

[36] A. Morales, M. Campos, J. M. Juárez, B. Cánovas-Segura, F. Palacios, and R. Marín, “A decision support system for antibiotic prescription based on local cumulative antibiograms,” J. of Biomedical Informatics, vol. 84, pp. 114–122, 2018.

[37] H. García-Caballero, M. Campos, J. M. Juárez, and F. Palacios, “Visualization in clinical decision support system for antibiotic treatment,” in Proceedings of the Conference of Spanish Society on Artificial Intelligence (CAEPIA15), 2015, pp. 71–80.

[38] B. Cánovas-Segura, F. Zerbato, B. Oliboni, C. Combi, M. Campos, A. Morales, J. M. Juárez, R. Marín, and F. Palacios, “A process-oriented approach for supporting clinical decisions for infection management,” in 2017 IEEE International Conference on Healthcare Informatics (ICHI), Aug 2017, pp. 91–100.

[39] B. Cánovas-Segura, F. Zerbato, B. Oliboni, C. Combi, M. Campos, A. M. Nicolás, J. M. Juárez, F. Palacios, and R. Marín, “A decision support visualization tool for infection management based on BPMN and DMN,” in CITI, ser. Communications in Computer and Information Science, vol. 749. Springer, 2017, pp. 158–168.

[40] N. Iglesias, J. M. Juárez, M. Campos, and F. Palacios, “Computable representation of antimicrobial recommendations using clinical rules: A clinical information systems perspective,” in IWINAC (1), ser. Lecture Notes in Computer Science, vol. 9107. Springer, 2015, pp. 258–268.

[41] E. Lupiani, J. M. Juárez, J. Palma, and R. Marín, “Monitoring elderly people at home with temporal case-based reasoning,” Knowledge-Based Systems, vol. 134, pp. 116–134, 2017. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0950705117303477

[42] E. Lupiani, J. M. Juárez, J. T. Palma, C. S. Sauer, and T. Roth-Berghofer, “Using case-based reasoning to detect risk scenarios of elderly people living alone at home,” in ICCBR, ser. Lecture Notes in Computer Science, vol. 8765. Springer, 2014, pp. 274–288.

[43] E. Lupiani, J. M. Juárez, and J. T. Palma, “A proposal of temporal case-base maintenance algorithms,” in ICCBR, ser. Lecture Notes in Computer Science, vol. 8765. Springer, 2014, pp. 260–273.

[44] E. Lupiani, J. M. Juárez, F. Jiménez, and J. T. Palma, “Evaluating case selection algorithms for analogical reasoning systems,” in IWINAC (1), ser. Lecture Notes in Computer Science, vol. 6686. Springer, 2011, pp. 344–353.

[45] E. Lupiani, S. Craw, S. Massie, J. M. Juárez, and J. T. Palma, “A multi-objective evolutionary algorithm fitness function for case-base maintenance,” in ICCBR, ser. Lecture Notes in Computer Science, vol. 7969. Springer, 2013, pp. 218–232.

[46] J. M. Juárez, J. M. Ochotorena, M. Campos, and C. Combi, “Multiple temporal axes for visualising the behaviour of elders living alone,” in ICHI. IEEE Computer Society, 2013, pp. 387–395.

[47] J. M. Juárez, G. García-Fernández, M. Campos, B. Martínez, M. Antequera, and C. Antúnez, “Experiences on computerised neuropsychological tests for dementia using a mobile touchable interface,” in ICHI. IEEE Computer Society, 2014, pp. 355–361.

[48] M. M. Antequera, M. T. Daza, F. Guil, J. M. Juárez, and G. López-Crespo, “An architecture proposal for adaptive neuropsychological assessment,” in IWINAC (1), ser. Lecture Notes in Computer Science, vol. 5601. Springer, 2009, pp. 426–436.