My name is James, @techninja on GitHub, and I'm a longtime HN watcher and learner. About six months ago I decided, crazily, to follow a random thought about trying to do something with my DNA data, and now I guess it works well enough to show ya? Let's see what you think. It's an open source, client-side polygenic scoring tool and imputation pipeline for common traits (64 right now); see the data pipeline at https://github.com/techninja/asili-lab.
When I thought of this as something people could use, I clearly saw that no one is willing to go find their file just to hand it to some server where it could be linked to their online tracking IDs and fed into the operator's own trait research. This kinda flips that on its head. I did all the work to compile the PGS variant collections for each trait as listed in the PGS Catalog. Each trait is a single file, split by chromosome, that gets scored directly (with fast liftover from hg19 to hg38 positions). That lets you score your local DNA against gigabytes of weighted variants for all of the identified traits by sipping byte ranges from a cheap Cloudflare R2 CDN (and it's hosted on the Cloudflare Pages free tier!).
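For anyone new to polygenic scores, the scoring step boils down to a weighted sum: for each variant in a trait's list, count how many copies of the effect allele you carry and multiply by that variant's weight. Here's a minimal Python sketch of the idea. It is not the actual Asili code (which runs in the browser through DuckDB-wasm), and the data layout, the function names `dosage` and `polygenic_score`, and the SNP-only, unphased two-letter genotype format are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical layouts, chosen only for this sketch:
#   genotypes: (chromosome, hg38 position) -> two-letter genotype, e.g. "AG"
#   weights:   rows of (chromosome, hg38 position, effect allele, weight)
Genotypes = Dict[Tuple[str, int], str]
WeightRow = Tuple[str, int, str, float]


def dosage(genotype: str, effect_allele: str) -> int:
    """Count copies (0, 1, or 2) of a single-base effect allele."""
    return genotype.count(effect_allele)


def polygenic_score(
    genotypes: Genotypes, weights: Iterable[WeightRow]
) -> Tuple[float, int]:
    """Return (weighted sum, number of scoring variants found in the file)."""
    total = 0.0
    matched = 0
    for chrom, pos, effect_allele, weight in weights:
        genotype = genotypes.get((chrom, pos))
        if genotype is None:
            # Variant not present in the user's file: skip it here.
            # Filling these gaps is the job of imputation.
            continue
        total += weight * dosage(genotype, effect_allele)
        matched += 1
    return total, matched


if __name__ == "__main__":
    my_dna = {("1", 100): "AG", ("1", 200): "CC", ("2", 50): "TT"}
    trait = [
        ("1", 100, "A", 0.5),
        ("1", 200, "C", -0.2),
        ("2", 50, "G", 1.0),
        ("3", 10, "A", 0.3),
    ]
    score, used = polygenic_score(my_dna, trait)
    print(f"score={score:.3f} from {used} of {len(trait)} variants")
```

Because each variant contributes independently, the sum can be built up one chromosome chunk at a time, which is what makes it a good fit for fetching only the needed byte ranges instead of whole files.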
The benefits of the architecture are pretty obvious as far as simplified hosting and prep for me, but the benefits for the user are huge: no risk of a data breach or willful corporate data selling. Their data stays put locally, and their mobile phone or laptop is the compute through DuckDB-wasm. Using a 10 year old machine and a single core, I can score 64 traits for a user in ~3 hours.
The data pipeline lab work also has my best attempt at a near-identical output and process to the Michigan Imputation Server (https://imputationserver.sph.umich.edu/), running locally from downloaded, publicly available genome data (~300-500 GB?). I'm offering to run the pipeline pre-configured in AWS for a small cost (https://impute.asili.dev/about), my first personal AWS and Stripe integration project! Scare-cited for sure.
Even worse, I couldn't stand to put yet another React project out into the world, so I collected all the crap I'd learned doing web dev since the 90s into some kind of modern, no-build collection of ideas I call Clearstack (https://clearstacks.org/). Asili and all of these sites I maintain (and my personal blog) are running on this scaffolding spec framework to great success, but it might still be a stupid idea, so don't go jumping on the wagon yet.
I am a solo maintainer, and I'm sharing this as a beta to gather feedback from bioinformaticians, developers, and self-hosters. The repository is here: https://github.com/techninja/asili
I would love feedback on:
* Any bottlenecks or issues for certain machines
* Thoughts on no-build dev architecture
* Any related projects I can look into
I'll be around to answer questions, handle technical feedback, and review suggestions!