Adapting pretrained LLMs to the medical domain

In this post, I aim to provide the technical details of how I adapted natural-language LLMs to the longitudinal, sparse numerical data of the medical domain. In particular, I will describe how I generated synthetic data to work around data-privacy and resource constraints, and the architecture of the adapted LLMs that bridges the gap between the two domains. For details on applications and performance, please see the previous post.

This post is yet to be written. In the meantime, you can visit my GitHub repository for in-depth instructions on how to install and run the pipeline. The paper is in production.