Computational Linguistics

Exploring the intersection of language, culture, and computation through the lens of East African linguistic heritage. Building tools that preserve, analyze, and celebrate the rich tapestry of human language diversity.

Mother Tongue: Somali
Cultural Focus: East Africa
Research Active

Research Areas

Language Preservation

Digital tools for documenting and preserving endangered East African languages

Cognitive Linguistics

How language structures influence computational thinking and problem-solving

Code as Language

Analyzing programming languages through the lens of natural language theory

Oral Traditions

Computational analysis of storytelling patterns and cultural knowledge

Active Projects

Somali Morphological Analyzer

Active Research

A tool for analyzing Somali morphology. Built to handle the language's agglutinative word formation, with support for root extraction, affix analysis, and grammatical categorization.

Cultural Impact: Preserving the linguistic heritage of 20+ million Somali speakers worldwide

Key Insights:

  • Somali shows vowel harmony between front and back vowel series
  • Affixes stack, so a single word can carry several morphemes
  • Tone changes meaning and marks grammatical distinctions such as gender and number

Python, NLTK, Transformers, FastAPI

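
Root extraction and affix analysis can be illustrated with a toy suffix stripper. The sketch below is not the analyzer itself: the suffix table (only the definite articles -ka/-ga and -ta/-da), the gloss labels, the `Analysis` type, and the `min_root` threshold are all assumptions chosen for illustration. A real analyzer would need a lexicon and sound-change rules, since naive stripping will split words that merely happen to end in these letters.

```python
from dataclasses import dataclass

# Hypothetical, deliberately tiny suffix table: the definite articles
# -ka/-ga (masculine) and -ta/-da (feminine). Gloss labels are illustrative.
SUFFIXES = {
    "ka": "DEF.M",
    "ga": "DEF.M",
    "ta": "DEF.F",
    "da": "DEF.F",
}


@dataclass
class Analysis:
    root: str
    affixes: list  # list of (suffix, gloss) pairs


def analyze(word: str, min_root: int = 3) -> Analysis:
    """Strip at most one known suffix, keeping a root of at least min_root letters."""
    lowered = word.lower()
    for suffix, gloss in SUFFIXES.items():
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= min_root:
            return Analysis(lowered[: -len(suffix)], [(suffix, gloss)])
    # No known suffix (or the remaining root would be too short): return as-is.
    return Analysis(lowered, [])


ninka = analyze("ninka")    # nin "man" + -ka
naagta = analyze("naagta")  # naag "woman" + -ta
```

The `min_root` guard is a crude stand-in for a lexicon lookup: it only stops very short words such as "ka" from being reduced to an empty root.
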
Somali Oral Tradition Archive

Community Collaboration

Digital preservation system for oral narratives, proverbs, and traditional stories from East African cultures. Includes audio processing, transcription, and semantic analysis of cultural knowledge embedded in storytelling.

Cultural Impact: Preserving 1000+ years of oral knowledge from Horn of Africa communities

Key Insights:

  • Oral narratives contain complex temporal structures
  • Cultural metaphors require contextual understanding
  • Storytelling patterns vary by ethnic group

Python, Whisper, spaCy, PostgreSQL

Low-Resource Language Translation

Model Training

Building translation tools for underrepresented East African languages using transfer learning and few-shot techniques. Focus on Somali, Oromo, Amharic, and Tigrinya language pairs.

Cultural Impact: Bridging communication gaps for 200+ million East Africans

Key Insights:

  • Transfer learning from Arabic improves Somali translation
  • Cultural concepts often lack direct translations
  • Dialectal variations require region-specific models

PyTorch, Transformers, OPUS, SentencePiece

Computational Poetics Engine

Research Phase

Analyzing the computational structure of traditional East African poetry and verse. Exploring rhythm, meter, and semantic patterns in Somali gabay, Ethiopian qene, and other poetic forms.

Cultural Impact: Digitizing 1000+ years of East African poetic traditions

Key Insights:

  • Somali gabay follows strict alliterative patterns
  • Rhythmic structures encode cultural memory
  • Poetic devices vary significantly across regions

Python, Phonetics, Pattern Recognition, Audio Analysis

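
The alliteration constraint in gabay lends itself to a simple check: does every line contain at least one word beginning with the same sound? The sketch below assumes two simplifications that are not taken from this project: the digraphs sh, dh, and kh count as single consonants, and all vowels alliterate with one another. The function names are hypothetical, and the sample lines are invented word strings for testing, not real verse.

```python
VOWELS = set("aeiou")
DIGRAPHS = ("sh", "dh", "kh")  # written with two letters, treated as one consonant


def initial_sound(word: str) -> str:
    """Return the alliterative class of a word's first sound."""
    w = word.lower()
    if not w:
        return ""
    if w[0] in VOWELS:
        return "V"  # assumption: all vowels alliterate with each other
    if w[:2] in DIGRAPHS:
        return w[:2]
    return w[0]


def alliterates(lines, sound):
    """True if every line has at least one word starting with `sound`."""
    return all(
        any(initial_sound(word) == sound for word in line.split())
        for line in lines
    )


def shared_sounds(lines):
    """Sounds that begin at least one word in every line."""
    per_line = [{initial_sound(w) for w in line.split()} for line in lines]
    return set.intersection(*per_line) if per_line else set()


# Invented word strings used only to exercise the functions.
sample = ["dhulka dheer", "nabad iyo dhaqan"]
```

Working on spelling alone is a shortcut. A fuller treatment would operate on a phonetic transcription and on the metrical units of the line rather than on whitespace-separated words.
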
Cultural Concept Mapping

Ontology Development

Building semantic networks that capture culture-specific concepts and their relationships. Mapping untranslatable words and cultural knowledge systems from East African languages.

Cultural Impact: Preserving indigenous knowledge systems and worldviews

Key Insights:

  • Many cultural concepts have no English equivalents
  • Kinship terms encode complex social structures
  • Environmental knowledge is embedded in language

Neo4j, Knowledge Graphs, Semantic Web, RDF
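
A semantic network of this kind can be sketched as a set of subject, relation, object triples. The example below is an in-memory stand-in written in plain Python, not the project's Neo4j or RDF schema. The `ConceptGraph` class and the relation names `narrower_than` and `side` are hypothetical. It uses the Somali kinship terms abti (maternal uncle) and adeer (paternal uncle) to show a distinction that the single English word "uncle" collapses.

```python
from collections import defaultdict


class ConceptGraph:
    """Minimal triple store: subject -> set of (relation, object) edges."""

    def __init__(self):
        self._out = defaultdict(set)

    def add(self, subject, relation, obj):
        self._out[subject].add((relation, obj))

    def objects(self, subject, relation):
        """All objects linked from `subject` by `relation`."""
        return {o for r, o in self._out.get(subject, ()) if r == relation}

    def subjects(self, relation, obj):
        """All subjects that have a `relation` edge to `obj`."""
        return {s for s, edges in self._out.items() if (relation, obj) in edges}


graph = ConceptGraph()
# Relation names below are illustrative, not a published ontology.
graph.add("abti", "narrower_than", "uncle")
graph.add("abti", "side", "maternal")
graph.add("adeer", "narrower_than", "uncle")
graph.add("adeer", "side", "paternal")

# Both Somali terms map onto one English concept.
somali_uncle_terms = graph.subjects("narrower_than", "uncle")
```

Keeping the English gloss as a separate node, rather than as the identifier of the Somali term, lets the graph record that a translation is only approximate.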

Research Philosophy

Language as Living Heritage

Every language encodes unique ways of understanding the world. My work focuses on preserving and celebrating this diversity, ensuring that computational tools serve all language communities, not just speakers of dominant languages.

Technology for Cultural Preservation

Computational linguistics should empower communities to maintain their linguistic heritage. I build tools that are culturally aware, community-driven, and designed to strengthen rather than replace traditional knowledge systems.

Inclusive AI Development

AI systems trained only on dominant languages perpetuate inequality. My research explores how to build more inclusive models that understand and respect linguistic diversity, especially for low-resource languages.

Code as Cultural Expression

Programming languages are not culturally neutral. I investigate how our native languages influence our coding patterns, problem-solving approaches, and the systems we build, celebrating this diversity in computational thinking.

Collaborate & Connect

Interested in computational linguistics, cultural preservation, or multilingual AI? Let's explore the beautiful complexity of human language together.