MayBMS (pronounced “maybe-MS”) is an open-source probabilistic database management system built entirely inside the PostgreSQL server backend. It is designed to manage uncertain data, support data cleaning tasks, and evaluate probabilistic queries through extensions to relational algebra and SQL.
Core Architecture and Features
Unlike traditional databases that assume every data point is 100% accurate, MayBMS natively manages incomplete, ambiguous, or imprecise information.
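U-Relations Backend: MayBMS uses a representation system called U-relations. A U-relation is an ordinary relational table extended with condition columns, which record the random-variable assignments under which a row exists, together with the probabilities of those assignments. This factors a large joint probability distribution into compact tables that the standard Postgres storage and query machinery can process.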
The “Possible Worlds” Framework: The database interprets data under a possible-worlds model. Instead of a single fixed dataset, it conceptually represents a set of alternative datasets (worlds), each with a probability. The confidence of a query answer is the total probability of the worlds in which that answer appears.
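The possible-worlds reading of a U-relation can be sketched in plain Python. This is a toy illustration, not MayBMS code: the variable names, rows, and probabilities are made up, the helper functions are hypothetical, and the variables are assumed to be independent.

```python
from itertools import product

# World table: each independent random variable maps to its alternative
# values and their probabilities (all numbers are illustrative).
WORLD_TABLE = {
    "x": {1: 0.7, 2: 0.3},
    "y": {1: 0.4, 2: 0.6},
}

# U-relation-style rows: (condition, data tuple). A row belongs to a world
# only if that world agrees with every assignment in its condition.
U_CUSTOMER = [
    ({"x": 1}, ("alice", "Berlin")),
    ({"x": 2}, ("alice", "Bern")),
    ({"y": 1}, ("bob", "Oslo")),
    ({"y": 2}, ("bob", "Osaka")),
]


def possible_worlds(world_table):
    """Yield (assignment, probability) for every combination of values."""
    names = sorted(world_table)
    for values in product(*(world_table[n] for n in names)):
        assignment = dict(zip(names, values))
        prob = 1.0
        for name, value in assignment.items():
            prob *= world_table[name][value]
        yield assignment, prob


def instance_in(world, u_relation):
    """Return the ordinary (certain) relation that holds in one world."""
    return {
        row
        for cond, row in u_relation
        if all(world[var] == val for var, val in cond.items())
    }


def confidence(target, u_relation, world_table):
    """Sum the probabilities of the worlds that contain the target tuple."""
    return sum(
        prob
        for world, prob in possible_worlds(world_table)
        if target in instance_in(world, u_relation)
    )


WORLDS = list(possible_worlds(WORLD_TABLE))
CONF_BERLIN = confidence(("alice", "Berlin"), U_CUSTOMER, WORLD_TABLE)
```

Four rows and two small variables already describe four worlds. Enumerating them explicitly, as this sketch does, grows exponentially with the number of variables, which is why a compact representation that is queried directly matters in practice.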
Postgres Compatibility: Because it is implemented as an extension of the Postgres backend, MayBMS appears to external clients as an ordinary Postgres server. Existing drivers, middleware, and applications that can talk to Postgres can connect to it without being re-engineered.
Advanced Data Cleaning Capabilities
Data cleaning in real-world pipelines often forces engineers to pick a value arbitrarily when records conflict, or to drop incomplete records entirely. MayBMS replaces these brittle choices with explicit probabilities: conflicting values can be kept side by side as weighted alternatives instead of being resolved by hand.
Source: MayBMS User Manual, SourceForge.
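As a rough illustration of that idea, the sketch below keeps every conflicting value for a key and turns per-record weights into probabilities. It is loosely modeled on the repair-key construct in MayBMS's query language, but it is plain Python, not MayBMS syntax: the `repair_key` function, the sample records, and the weights are all hypothetical.

```python
from collections import defaultdict

# Hypothetical conflicting records: (customer_id, city, weight).
# A higher weight means the source is trusted more.
RAW = [
    (1, "Berlin", 3.0),
    (1, "Bern", 1.0),
    (2, "Oslo", 1.0),
]


def repair_key(rows):
    """Group rows by key and normalize weights into per-key probabilities."""
    groups = defaultdict(list)
    for key, value, weight in rows:
        groups[key].append((value, weight))

    repaired = {}
    for key, alternatives in groups.items():
        total = sum(weight for _, weight in alternatives)
        distribution = {}
        for value, weight in alternatives:
            # Accumulate so that duplicate values for a key add up.
            distribution[value] = distribution.get(value, 0.0) + weight / total
        repaired[key] = distribution
    return repaired


CLEANED = repair_key(RAW)
```

Here customer 1 keeps both candidate cities, with Berlin three times as likely as Bern, while customer 2, which has no conflict, keeps its single value with probability 1. No record is discarded and no value is chosen arbitrarily.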