Google is very interested in knowing whatever you're interested in, and when, how often, and for how long you're interested in it. Beyond what they learn from their search engine, their ads, and their recommendations, you're feeding them more and more data about yourself.
Respectfully, please, this below is an absolute joke--has it changed in a decade?
Image: Apple provides easy switching between Google, Yahoo, Bing, DuckDuckGo, and Ecosia. (That poor paid search engine that starts with a K has to mess around with extensions; I'm unaffiliated.)
Plus, even though I have it set to DuckDuckGo, when I ask Siri to "search {query}", it searches Google. So even the default I've set isn't actually the all-around default. Embarrassing to have this locked down as if I hadn't dropped a grand on the phone.
-
Fun fact: for years now, asking Siri to "search Google Images" has returned white-labeled Bing Images. Thankfully it's exceptionally easy to remedy with the excellent Shortcut "Picture-Search" for Google; the quality difference is night and day, unfortunately. Anyway, go SearXNG! It lets you keep your soul.
Usually, though, a bottom-up approach using automatically updating `Map of Content` notes (Bases) works well for me for finding content.
For example, Syncthing on Debian notes [1] or using Spleeter AI to remove background sound from a long audio track [2]. This is why I switched back from a static site to a WordPress-like site [3], so that I can quickly publish notes from my phone.
[1]: https://huijzer.xyz/posts/149/setup-a-syncthing-service-on-d...
[2]: https://huijzer.xyz/posts/146/installing-and-running-spleete...
Whenever I do something and realize I might need it in the future, I just store it in the corresponding project. That seems to have been serving me well for some time.
Rather than just coalescing everything into markdown files, the memory-zet plugin looks for actionable, durable information and files it inside the existing zettelkasten system with embeddings. A quick no-LLM step (well, a 300M-parameter query embedder, so it’s fast) runs against incoming chats or as a tool and returns cards (zettels).
Zettels are somewhat unique in that the original methodology included a post-writing categorization and linking step; I have the system doing this as well. The result: cards give you a (possibly cyclic) directed graph of connectivity. I built it for ‘centaur’ mode, so I can edit, link, unlink, move, etc. through a nice little web interface.
The auto links are not the quality I would make by hand, but they are genuinely useful. The upshot: for anything incoming, the LLM can see information directly about the query (if we have it), plus stuff that relates whether or not it embeds similarly, and it can follow up links that look promising with a fast tool call.
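Not the plugin's actual code, but a minimal sketch of the retrieval idea described above: embed the query, rank cards by cosine similarity, then do a one-hop expansion along the cards' explicit links. The card ids, contents, and the toy `embed` function are all made up for illustration; a real system would call a small sentence-embedding model instead of generating deterministic noise.

```python
import math
import random

# Toy stand-in for the small query-embedding model (the real plugin is
# described as using a ~300M-parameter embedder; this is deterministic noise).
def embed(text: str, dim: int = 64) -> list[float]:
    rng = random.Random(text)  # seeding with a str is deterministic across runs
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

# Each card (zettel) has text, an embedding, and explicit links to other
# cards, forming a possibly cyclic directed graph. Contents are hypothetical.
cards = {
    "z1": {"text": "Syncthing service setup", "links": ["z2"]},
    "z2": {"text": "systemd user services", "links": ["z1", "z3"]},
    "z3": {"text": "Debian packaging quirks", "links": []},
}
for card in cards.values():
    card["vec"] = embed(card["text"])

def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    hits = sorted(cards, key=lambda cid: -cosine(q, cards[cid]["vec"]))[:k]
    # One-hop link expansion: pull in explicitly linked cards even when
    # they don't embed similarly to the query.
    expanded = []
    for cid in hits:
        for linked in cards[cid]["links"]:
            if linked not in hits and linked not in expanded:
                expanded.append(linked)
    return hits + expanded

results = retrieve("Syncthing service setup")
```

The link-following step is what separates this from bulk embedding search: a card can surface because a similar card points at it, not because it embeds near the query.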
I made this memory system my daily driver yesterday; so far it is a significant improvement over the core memory extension (write to markdown files, don’t worry about compaction bro, it will be fine)!
It’s already building out people and organization card bases for things that come in via email and WhatsApp; this is a dream, basically. I think it will scale over time, but it’s at least scaling nicely over a few days of work right now.
I think the essay will be something like: adding structure post hoc lets you build intelligence into the datastore as an architectural matter, rather than relying on connections being made during use-time inference; using an embedding with links like this is much different from bulk embedding search; and we need some sort of tests to understand whether this helps in practice, although (a) it feels pretty good and (b) it’s VERY nice to be able to refer to and modify the agent’s “mid-term” memory directly in any event.
Anyway, you’ve triggered me enough that I’ll try to get the repo published today so people can look at it.
I haven't found a way to automate this import of my data, but most of the magic is in the history, not the present. It really is incredible: I'll ask the claw to find what I said about the SFPD cruiser I once saw in the TL, and boom, it's there! A mild annoyance with my MediaWiki-based blog (which I chose because it has good support for letting users edit it) is that authoring is still a lot of work and I keep forgetting Draft-namespace articles.
Anything else is a bandaid.
Some 80,000+ files in a directory make an awesome database of knowledge. "$ ls inux" to find anything Linux-related, etc.
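In the same spirit (not the commenter's actual setup), a tiny sketch of case-insensitive substring search over a flat notes directory, assuming one file per topic with descriptive filenames:

```python
from pathlib import Path

def find_notes(notes_dir: str, needle: str) -> list[str]:
    """Match filenames by substring, like eyeballing `ls` output."""
    needle = needle.lower()
    return sorted(
        p.name
        for p in Path(notes_dir).iterdir()
        if p.is_file() and needle in p.name.lower()
    )
```

So `find_notes("~/notes", "inux")` would surface anything Linux-related, whatever the capitalization.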
One of these days I'll get around to setting up some ML tool that will tell me all the things I didn't already osmose from the archive .. and maybe long after I'm gone, in some hole in a wall of some grimy back alley somewhere, there'll be a ML version of me embedded in a brick, ready to have the conversation well into the future ..
Am I the only one who gets physically ill listening to themselves speak? =)
Tried Evernote and tagging and so on, and it turns out cataloging stuff is hard; the lazy recourse is to over-tag, and then I end up doing a brute-force search anyway.