Differences between structured databases and unstructured information in rheumatoid arthritis (RA) research
Characteristic | Structured databases | Unstructured information
---|---|---
Definition | Uses diagnostic codes and predefined formats | Found in free text or images |
Data source | Claims; prescriptions and administrative databases | Clinical notes; imaging data |
Data collection | International Classification of Diseases, 9th revision (ICD-9) and 10th revision (ICD-10) codes | Natural language processing (NLP) for text; convolutional neural networks (CNNs) for imaging
Examples of RA research | Detailed study of comorbidities; treatment safety | Identification of RA patients; extraction of outcome measures |
Limitations | Limited by predefined formats; requires systematic coding; possible missing variables and biases | Analytical challenges; requires precise data detection; algorithm design challenges
Benefits | Systematic and standardized data; detection of long-term trends; prevalence in broad populations | Enhances collection of specific features; contributes to multimodal research |
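As a rough illustration of the "Data collection" row, the sketch below contrasts a code-based lookup with a free-text search. It is a minimal example under stated assumptions, not a validated case definition: the code prefixes (ICD-10 M05/M06, ICD-9 714) are commonly used for RA but would need a validated list in a real study, the keyword pattern is a hypothetical stand-in for a full NLP pipeline, and the patient records are invented.

```python
import re

# Assumed code prefixes: ICD-10 M05/M06 and ICD-9 714 are commonly used for RA.
# A real study would rely on a validated code list.
RA_CODE_PREFIXES = ("M05", "M06", "714")

# Hypothetical keyword pattern standing in for a real NLP pipeline.
RA_TEXT_PATTERN = re.compile(r"\brheumatoid arthritis\b", re.IGNORECASE)


def has_ra_code(codes):
    """Structured route: True if any diagnostic code starts with an RA prefix."""
    return any(code.upper().startswith(RA_CODE_PREFIXES) for code in codes)


def mentions_ra(note):
    """Unstructured route: True if the free text contains the term."""
    return bool(RA_TEXT_PATTERN.search(note))


# Invented toy records, for illustration only.
patients = [
    {"id": 1, "codes": ["M05.79", "I10"], "note": "Follow-up visit, stable."},
    {"id": 2, "codes": ["E11.9"], "note": "Known rheumatoid arthritis on methotrexate."},
    {"id": 3, "codes": ["I10"], "note": "No joint complaints."},
]

by_code = [p["id"] for p in patients if has_ra_code(p["codes"])]
by_text = [p["id"] for p in patients if mentions_ra(p["note"])]
```

In this toy data each route finds a patient the other misses. The keyword search would also flag a note such as "No evidence of rheumatoid arthritis", which is one reason the table lists precision in data detection as a limitation of unstructured information.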
The authors thank the Spanish Foundation of Rheumatology for providing medical editorial assistance during the preparation of the manuscript.
DB: Conceptualization, Writing—original draft, Writing—review & editing. CPR: Conceptualization, Writing—review & editing, Validation. Both authors read and approved the submitted version.
DB reports speakers bureau/grants from AbbVie, Lilly, MSD, Pfizer, UCB, and Novartis outside of the submitted work. He is a part-time employee of Savanamed. CPR reports speakers bureau/grants from AbbVie, Pfizer, Novartis, Lilly, and Roche outside of the submitted work.
Not applicable.
© The Author(s) 2024.