Open-source AI benchmark

Which AI model knows Appwrite best?

Model | Cost/1M | Overall | Fundamental | Auth | Databases | Functions | Storage | Sites | Messaging | Realtime | CLI
Last update: Mar 4, 2026

Beyond the leaderboard

A closer look at how each model performs across every dimension we test.


Ask

We ask each model 180 questions across 9 Appwrite services to show the impact AI can have on your project.

Answer

Four models respond both with and without Appwrite docs, so we test raw knowledge as well as docs-assisted performance.

Score

Multiple-choice answers are scored deterministically against an answer key. Open-ended answers are scored by an AI judge.

Evaluation modes

Context changes everything

We test each model with and without Appwrite's docs. The gap reveals how well it leverages context.

With Skills.md

How do I create a document?

Here's how to create a document using the Appwrite SDK:

import { Client, Databases, ID } from 'appwrite';

const client = new Client()
  .setEndpoint('https://<REGION>.cloud.appwrite.io/v1')
  .setProject('<PROJECT_ID>');

const databases = new Databases(client);
await databases.createDocument({
  databaseId: '<DATABASE_ID>',
  collectionId: '<COLLECTION_ID>',
  documentId: ID.unique(),
  data: { title: 'Hello', content: 'World' }
});

Without Skills.md

How do I create a document?

Sure, here's how you can create a document in Appwrite:

import Appwrite from 'appwrite';

const appwrite = new Appwrite();
appwrite.setEndpoint('https://appwrite.io/v1');

// Removed in Appwrite 1.0
await appwrite.database.createDocument(
  'collectionId', { title: 'Hello' }
);
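
The gap between the two modes can be expressed as a simple difference. The snippet below is a minimal sketch, not the benchmark's actual code: the function name `contextGap` and the assumption that scores are fractions between 0 and 1 are illustrative only.

```javascript
// Hypothetical helper: none of these names come from the benchmark's code.
// Scores are assumed to be fractions in the range 0..1.
function contextGap(withDocs, withoutDocs) {
  for (const score of [withDocs, withoutDocs]) {
    if (!Number.isFinite(score) || score < 0 || score > 1) {
      throw new RangeError(`score must be between 0 and 1, got ${score}`);
    }
  }
  // Positive gap: the docs helped. Negative gap: the docs hurt.
  return withDocs - withoutDocs;
}

// Illustrative numbers only, not real benchmark results.
const gap = contextGap(0.75, 0.5);
console.log(gap); // 0.25
```

A large positive gap suggests a model that depends heavily on supplied context. A small gap with a high baseline suggests the model already knows the material.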

Benchmarking the full Appwrite platform

Every question is drawn from actual Appwrite platform usage, covering all the services.

Fundamental

Core concepts, SDKs, permissions, and platform basics

Auth

Authentication methods, user management, and sessions

Databases

Collections, documents, queries, and relationships

Functions

Serverless functions, runtimes, and execution

Storage

File uploads, buckets, and file management

Sites

Static site hosting, domains, and deployments

Messaging

Push notifications, SMS, email, and providers

Realtime

WebSocket subscriptions, channels, and live events

CLI

CLI installation, configuration, and deployment workflows
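
A per-service breakdown like the one in the leaderboard columns could be produced by grouping results by category and averaging. This is a rough sketch under assumptions: the result shape (`category`, `score`) and the function name `scoresByCategory` are hypothetical and may not match the benchmark's data format.

```javascript
// Hypothetical result shape: { category: string, score: number in 0..1 }.
function scoresByCategory(results) {
  const totals = new Map();
  for (const { category, score } of results) {
    const entry = totals.get(category) ?? { sum: 0, count: 0 };
    entry.sum += score;
    entry.count += 1;
    totals.set(category, entry);
  }
  // Convert running sums into a plain object of per-category averages.
  const averages = {};
  for (const [category, { sum, count }] of totals) {
    averages[category] = sum / count;
  }
  return averages;
}

// Illustrative data only, not real benchmark results.
const sample = [
  { category: 'Auth', score: 1 },
  { category: 'Auth', score: 0 },
  { category: 'Storage', score: 1 },
];
console.log(scoresByCategory(sample)); // { Auth: 0.5, Storage: 1 }
```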

Scoring methods

Fair and predictable scoring

We score every answer twice: once for accuracy, once for quality.

Deterministic (MCQ)

Each model answers 154 multiple-choice questions. Each question has exactly one correct answer, leaving no room for interpretation.

Fully reproducible · No judge bias · Factual recall only
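
Deterministic scoring of this kind comes down to comparing each chosen option against an answer key. The following is a minimal sketch, assuming answers are single option letters keyed by question id. The function `scoreMcq` and both data shapes are hypothetical, not taken from the benchmark's repository.

```javascript
// Hypothetical shapes: answerKey and responses map a question id to an option letter.
function scoreMcq(answerKey, responses) {
  const ids = Object.keys(answerKey);
  if (ids.length === 0) throw new Error('answer key is empty');
  let correct = 0;
  for (const id of ids) {
    // Normalize case and whitespace; a missing answer counts as wrong.
    const given = String(responses[id] ?? '').trim().toUpperCase();
    if (given === answerKey[id].toUpperCase()) correct += 1;
  }
  return { correct, total: ids.length, accuracy: correct / ids.length };
}

// Illustrative data only.
const key = { q1: 'B', q2: 'D', q3: 'A', q4: 'C' };
const answers = { q1: 'b', q2: 'D', q3: 'C' }; // q4 left unanswered
console.log(scoreMcq(key, answers)); // { correct: 2, total: 4, accuracy: 0.5 }
```

Because the comparison is a plain string match, the same inputs always give the same score, which is what makes this mode reproducible.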

AI-Judged (Open-ended)

The remaining 26 open-ended questions are scored from 0 to 1 by an AI judge against a rubric and a reference answer.

Tests reasoning · Real-world usage · Slight variance
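
One plausible way to fold judge scores into a single number is to average them and then weight both question types by their counts. This is only a sketch: the page does not say how the overall score is computed, so the equal per-question weighting and the names `meanJudgeScore` and `overallScore` are assumptions.

```javascript
const MCQ_COUNT = 154;       // multiple-choice questions, as stated above
const OPEN_ENDED_COUNT = 26; // open-ended questions, as stated above

// Average a list of judge scores, clamping each into the 0..1 range.
function meanJudgeScore(scores) {
  if (scores.length === 0) throw new Error('no scores to average');
  const clamped = scores.map((s) => Math.min(1, Math.max(0, s)));
  return clamped.reduce((a, b) => a + b, 0) / clamped.length;
}

// ASSUMPTION: every question carries equal weight in the overall score.
function overallScore(mcqAccuracy, judgeMean) {
  const total = MCQ_COUNT + OPEN_ENDED_COUNT;
  return (mcqAccuracy * MCQ_COUNT + judgeMean * OPEN_ENDED_COUNT) / total;
}

// Illustrative numbers only, not real benchmark results.
console.log(meanJudgeScore([1, 0.5, 0])); // 0.5
```

Since judge output can vary slightly between runs, averaging several judge passes per answer before calling `meanJudgeScore` would be one way to dampen that variance.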

Fully open source

Every question, answer, and score is public.
Fork it, run it, improve it.

View on GitHub