Pretrained a 150M-parameter Korean language model fully from scratch using only a single Nvidia RTX 3090 GPU. Built a tokenizer with SentencePiece, used a LatentMoE architecture, gathered the datasets, and fine-tuned the result. Open-sourced on Huggingface and Github.
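For the tokenizer step, here is a minimal sketch of training a Korean SentencePiece model. The corpus filename, vocab size, and other settings are placeholder assumptions, not the actual configuration I used.

```python
# Minimal SentencePiece training sketch (filenames and sizes are assumptions).
import sentencepiece as spm

spm.SentencePieceTrainer.train(
    input="corpus_ko.txt",        # one sentence per line, raw Korean text
    model_prefix="ko_tokenizer",  # writes ko_tokenizer.model / ko_tokenizer.vocab
    vocab_size=32000,             # assumed size; should match the model's embedding table
    model_type="bpe",             # BPE subwords; unigram is another common choice
    character_coverage=0.9995,    # keep rare Hangul/Hanja characters in the vocab
)

sp = spm.SentencePieceProcessor(model_file="ko_tokenizer.model")
print(sp.encode("안녕하세요, 한국어 언어 모델입니다.", out_type=str))
```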
Tried to mount a Llama3-2B model on an iOS phone. Got a bunch of errors about dynamic libraries, Metal, and memory allocation. Millions of failures, but I learned valuable things. Also built a text classification model and an extractive summarization model.
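For the extractive summarization part, here is a rough sketch of one common approach, scoring sentences by their TF-IDF weight. This is only an illustration of the general technique, not the pipeline I actually used.

```python
# Extractive summarization sketch: keep the top-scoring sentences by TF-IDF weight.
from sklearn.feature_extraction.text import TfidfVectorizer
import numpy as np

def extractive_summary(text: str, num_sentences: int = 2) -> str:
    # Naive sentence split; a real pipeline would use a proper sentence tokenizer.
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    if len(sentences) <= num_sentences:
        return text
    # Score each sentence by the sum of its TF-IDF term weights.
    tfidf = TfidfVectorizer().fit_transform(sentences)
    scores = np.asarray(tfidf.sum(axis=1)).ravel()
    # Keep the top-scoring sentences in their original order.
    top = sorted(np.argsort(scores)[-num_sentences:])
    return ". ".join(sentences[i] for i in top) + "."
```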
Normal vs. defect sealant classification problem. Accuracy skyrocketed to 98% using a Vision Transformer; I also tried TinyVGG, but a problem with its training procedure left that run at only 60% accuracy.
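As a sketch of the ViT approach, this is roughly what fine-tuning a pretrained Vision Transformer for a two-class defect task looks like. The dataset path, batch size, and hyperparameters are assumptions for illustration only.

```python
# Fine-tuning a pretrained ViT-B/16 for normal/defect classification (settings are assumed).
import torch
from torch import nn
from torchvision import datasets, models

device = "cuda" if torch.cuda.is_available() else "cpu"

weights = models.ViT_B_16_Weights.IMAGENET1K_V1
model = models.vit_b_16(weights=weights)
model.heads = nn.Linear(model.hidden_dim, 2)  # replace the head: two classes, normal / defect
model = model.to(device)

# ImageFolder expects one subdirectory per class, e.g. sealant/train/normal, sealant/train/defect.
data = datasets.ImageFolder("sealant/train", transform=weights.transforms())
loader = torch.utils.data.DataLoader(data, batch_size=32, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        loss = loss_fn(model(images), labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```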
Help! I accidentally built GPT from scratch! It was the first paper I read and the second AI model I built (the first was ViT). I'm learning most things by building them from scratch.
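To give a flavor of the from-scratch build, here is a toy sketch of the core piece of a GPT: a single decoder block with causal self-attention. The dimensions are illustrative placeholders, not the configuration of the model I trained.

```python
# One GPT-style decoder block: causal self-attention + MLP, each with a residual connection.
import torch
from torch import nn

class GPTBlock(nn.Module):
    def __init__(self, d_model: int = 256, n_heads: int = 4):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Causal mask: True entries are blocked, so each token only attends to earlier tokens.
        seq_len = x.size(1)
        mask = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device), diagonal=1
        )
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out               # residual around attention
        x = x + self.mlp(self.ln2(x))  # residual around the feed-forward MLP
        return x

x = torch.randn(2, 16, 256)    # (batch, sequence, embedding)
print(GPTBlock()(x).shape)     # torch.Size([2, 16, 256])
```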