I like this idea. I've written a couple of mini LLMs based on GPT-2, named Beeper. They were a bit too different and built on a couple of principles that seemed strong but didn't necessarily hold up under pressure yet. Since then I've mothballed the idea, but I plan to revisit it sooner now that I've found your article here.
BaseUtil is a new core component (src.geofractal.router.base_util). It inherits BaseComponent's behavior, so it allows device movement for util operations, which will direct device-to-device behavior for the upcoming Accelerate integration.
I'm trying to keep the base component structure as minimal as possible, but the need to chain components in specific orders presented a unique problem. By compartmentalizing utils into structures that can be delegated and moved, these structures can be repurposed, expanded autonomously, reduced autonomously, and more. A rough sketch of the inheritance idea follows.
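For illustration only: a minimal sketch of a util inheriting the base component's device behavior. The class bodies and method names here are hypothetical stand-ins, not the actual src.geofractal.router.base_util API.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the real BaseComponent: it tracks which
# device it lives on so delegated utils can follow it around.
class BaseComponent(nn.Module):
    def __init__(self):
        super().__init__()
        self._device = torch.device("cpu")

    def to(self, device, *args, **kwargs):
        self._device = torch.device(device)
        return super().to(self._device, *args, **kwargs)

# BaseUtil inherits the device-movement behavior, so a util can be
# delegated to whichever device its owning component occupies.
class BaseUtil(BaseComponent):
    def follow(self, owner: BaseComponent) -> "BaseUtil":
        return self.to(owner._device)
```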
ChainComponent inherits a subsystem specifically designed to organize multi-system, multi-device formulas for inception and synchronization purposes. This allows distributed tasking across multiple devices in chained utilization. It also enables easy integration with nn.ModuleList (a few remaining caveats will be ironed out), targeting wide-distributed models. A minimal illustration of the chaining idea appears below.
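As a rough illustration of the chaining idea (names here are hypothetical, not the real ChainComponent API), components can run in a fixed order with each stage pinned to its own device:

```python
import torch
import torch.nn as nn

# Each stage owns a device; inputs hop to that device before running.
class ChainedStage(nn.Module):
    def __init__(self, inner: nn.Module, device: str):
        super().__init__()
        self.device = torch.device(device)
        self.inner = inner.to(self.device)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.inner(x.to(self.device))

# nn.ModuleList keeps the chained order explicit and registers parameters.
chain = nn.ModuleList([
    ChainedStage(nn.Linear(32, 32), "cpu"),
    ChainedStage(nn.ReLU(), "cpu"),  # swap in "cuda:0", "cuda:1" on a pod
])

x = torch.randn(4, 32)
for stage in chain:  # deterministic chained utilization
    x = stage(x)
```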
FusionComponent is dedicated to the new fusion processing system meant for experimental expansion. This includes sub-module schedule control, Component and Tower functional control, and device movement. It will be packaged under the term "gfu.UtilType" as part of a standard naming convention:
"gfc.ComponentTypeName"
"gfr.RouterTypeName"
"gfu.UtilityTypeName"
"gft.TowerTypeName"
All of these are simply import aliases ("import ... as gfc" and so on), plus "gf.AnythingTopLevelPackaged", which will include the core. A quick illustration follows.
Better debugging for compilation
I'm in the prototyping phase of better debugging for compiled wide models and will prepare a baseline component readout structure by the end of the day today or tomorrow.
Geometric Manifold Walking: Stable High-Accuracy Multi-Encoder Fusion Without Backbone Training
I'm currently updating the core architecture and WideRouter compiler capacity while fixing a cache-related bug. As part of that, I've dedicated a specific part of all routers to caching, which is intentionally device-agnostic and clearable on a whim throughout a pod. A sketch of that idea follows.
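A minimal sketch of what a device-agnostic, clearable cache can look like, assuming CPU-side storage. This is illustrative only, not the router's actual cache component:

```python
import torch

class RouterCache:
    """Illustrative cache: device-agnostic storage, clearable on a whim."""

    def __init__(self):
        self._store: dict[str, torch.Tensor] = {}

    def put(self, key: str, value: torch.Tensor) -> None:
        # Keep entries on CPU so the cache never pins an accelerator.
        self._store[key] = value.detach().to("cpu")

    def get(self, key: str, device: torch.device | str) -> torch.Tensor | None:
        value = self._store.get(key)
        return value.to(device) if value is not None else None

    def clear(self) -> None:
        # Safe to call from anywhere in a pod at any time.
        self._store.clear()
```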
Full ramp-up for Accelerate testing is crucial very soon. To properly integrate the debugging and testing infrastructure, I need easy access to multiple runpod-friendly, clustered Docker, or indirectly connected devices with an API. Currently the debugging infrastructure on the compiler is essentially nonexistent: it simply crashes with a huge readout that does not assist with debugging, so the only way to surface the real error is to turn off compilation and run the test without it. This is fine uncompiled, but once compiled the debugging is difficult to read. That can cost time if you're unaware of the quirk, and I want to make it a non-issue by catching all necessary elemental exceptions based on the interactions and device movements. A sketch of the eager-fallback workaround follows.
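Until the compiler's own debugging lands, a simple eager-mode fallback along these lines (a hypothetical helper, not part of the repo) can surface the real exception:

```python
import torch
import torch.nn as nn

def run_with_eager_fallback(model: nn.Module, *args, use_compile: bool = True):
    """Run compiled; on failure, rerun eagerly to get a readable traceback."""
    if use_compile:
        try:
            return torch.compile(model)(*args)
        except Exception as err:
            # torch.compile readouts are hard to parse; the eager rerun
            # below reproduces the underlying error in plain form.
            print(f"compiled run failed ({type(err).__name__}); retrying eagerly")
    return model(*args)
```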
If I can get direct access to a Docker system with multiple devices (CPU, GPU, interface, data, raw, tensor, NumPy, whatever), I will gladly build a full wrapper structure that intermingles directly with diffusers, PyTorch, or perhaps TensorFlow, alongside the structural system I'm currently implementing. Please, share hardware; I'll gladly share my engineering and innovations. I'll focus my direct attention on whichever goal you wish first, then move on from there.
If anyone is willing to donate hardware for a month or two, I would be obliged to focus my attention on that person's or entity's needs. I would happily share time slots and utilization days or months with others, so long as I can get the architecture working at the capacity needed to provide the necessary training to the models required. My only motivation is to train these experiments directly, matching hardware access and accuracy against deeper models, while simultaneously training multiple adjacent models in a stable fashion.
This is one of my primary goals, so reach out directly at abstractpowered@outlook if you have any information. Access to stronger hardware will let me ramp up ablation tests and accuracy augmentations on machines that currently operate below SOTA, while enabling the same complexity as deep-model training with many devices, each tasked with different goals.
This is not an easy task, so it will take time. I encourage sharing resources, ideas, and useful concepts as well. You can find me on Discord as _abstract_ (display name AbstractPhila, with the popfrog display image) and in the Res4lyfe Discord, where I most often frequent the sampler madness channel. I have a fair history of chatting there and in a few other locations for bouncing ideas and keeping daily notes.
Many formulas were tested: 92 collective tests, oscillation bulk experiments, and more. All of them either coalesce into the correct behavior or fail in directly visible ways, which means the system is robust enough to declare some tools functionally valid, though not yet scalable.
AI crash course available;
https://github.com/AbstractEyes/geofractal/blob/main/ai_helpers/v101_claude_helpers.txt
Feed it to GPT, Claude, or Grok and they will assist.
getting started guide;
https://github.com/AbstractEyes/geofractal/blob/main/src/geofractal/router/GETTING_STARTED.md
geofractal router architecture is in prototype phases;
https://github.com/AbstractEyes/geofractal
This is likely one of its final growing phases before full production capacity is ramped up. The architecture is not for the novice; it's meant for experts who want to get ideas, borrow code, utilize library capacity, or simply tell AI what to do. Most files in current production have good descriptions for AI integration.
Transfer learning notebook available here;
https://github.com/AbstractEyes/geofractal/blob/main/src/geofractal/router/Router_Transfer_Learning-12_19_25.ipynb
Stress test and multiple diagnostics available here;
https://github.com/AbstractEyes/geofractal/blob/main/src/geofractal/router/components/diagnostics/
WideRouter compilation capacity available;
https://github.com/AbstractEyes/geofractal/blob/main/src/geofractal/router/wide_router.py
The WideRouter compiler organizes similar towers into stacked, staged combinations before compiling with torch.compile. This is experimental, but it has shown increased speed with multiple wide-model structures and will serve its purpose in the future. A rough sketch of the staging idea follows.
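As a rough illustration of the staging idea (hypothetical names; the real WideRouter compiler's grouping logic is more involved), similar towers can be stacked into one module so torch.compile traces the group once instead of once per tower:

```python
import torch
import torch.nn as nn

class StagedTowers(nn.Module):
    """Stack similar towers so one compile covers the whole staged group."""

    def __init__(self, towers: list[nn.Module]):
        super().__init__()
        self.towers = nn.ModuleList(towers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # One graph over all towers instead of a graph per tower.
        return torch.stack([tower(x) for tower in self.towers], dim=0)

towers = [nn.Linear(64, 64) for _ in range(4)]  # stand-in "similar towers"
staged = torch.compile(StagedTowers(towers))    # compile the staged group once
out = staged(torch.randn(2, 64))                # shape: (4, 2, 64)
```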