"sql.php" - my own ORM in 400 lines of PHP

PHP ORM databases SQL explanation programming

sql.php is a simple “active record”-like ORM written in PHP. It’s all contained into a single file, that goes by name of “sql.php”, has an extensible type system, soft deletions, CRUD operations, and manages to do all this with three classes: \SQL\DB, \SQL\Types and \SQL\Record.

I’ll start by making a short demonstration and then I’ll continue by going through the code, explaining why it does what it does and writing about future features I want to add.

require_once "sql.php";

class User extends \SQL\Record {
    public string $username;
    public string $password;

    public static function tableName() {
        return "users";
    }
}

$u = new User;
$u->username = "mark";
$u->password = "test";
$u->create();

I’ll also demonstrate how foreign keys work, but for anything more than that, you’d be better off consulting with the documentation.

// continuation of the code above
class Post extends \SQL\Record {
    public string $title, $content;
    public bool $public;
    public User $author;
}

$p = new Post;
$p->title = "Writing my own ORM in 400 lines of PHP";
$p->content = "bla bla bla";
$p->author = $u;
$p->create();

General code map

This will be a schematic view of the whole codebase, in order to better understand what we’re working with.

namespace SQL;

class DB {
    private static $instance;
    private static $dsn, $user, $pass;
    public static function getInstance();
    /* singleton stuff ... */
}

class Types {
    /* singleton stuff ... */

    private static $types = [];

    public static function add($nativeName, $sqlType, $serializer_cb, $unserializer_cb);
    public static function remove($nativeName);
    public static function get($nativeName);
    public static function getSqlType($nativeName);
}

\SQL\Types::add("string", "MEDIUMTEXT", NULL, NULL);
\SQL\Types::add("int", "MEDIUMINT", NULL, NULL);
/* ... declaring other default types */

class Record {
    public int $id;
    public \DateTime $created_at;
    public \DateTime $updated_at;
    public \DateTime $deleted_at;

    public static function tableName() {
        return strtolower(get_called_class());
    }

    public static function register();
    public function clear();
    public function create();
    public function update();
    public function delete($soft = true);
    public function fetch();
    public function fetchAll();

    private static function getColumns();
    private function getInitializedColumns();
    private static function getTypeFor(string $columnName);
    private static function isReference(string $columnName);
    private static function exec($commands);
}

From the inside out: the private functions

In the following article, I’ll use the terms “model”, “record” and “table” interchangeably. They will mostly refer to the SQL\Record class or to a SQL table.

The only variables a record class will have, will correspond to a SQL column. That being said, we now know that we can get the columns of a record using the following PHP snippet:

private static function getColumns() {
    $ref = new \ReflectionClass(static::class);
    $properties = $ref->getProperties(\ReflectionProperty::IS_PUBLIC);
    return array_map(function ($p) {
        return $p->name;
    }, $properties);
}

Note that static::class gives us the name of the class as a string. Remeber that, as it’ll be used extensively by the library.

You’ll also see the getInitializedColumns function, which is used by the CRUD operations/functions (create, fetch, fetchAll, update, delete) to build their SQL queries, while avoiding columns that have no role in the query because they’re empty. By only using initialized (also non-NULL) variables, we can skip a lot of error checking, because we know we are only working with non-NULL variables.

The getTypeFor is pretty self-explanatory, but a thing to remember is that, in case that the type is a class, the type name will be the name of the class.

Foreign keys

We know how to declare foreign keys, but let’s see how the library detects and works with them.

class User extends \SQL\Record {}
class Post extends \SQL\Record {
    public string $title;
    public User $author;
    public DateTime $created_at;
}

As a library, how would you tell the difference between the $author and $created_at variables? You must turn the $created_at variable into a SQL TIMESTAMP column, and the $author variable, into SQL INT with a foreign key constraint.

It’s simple. User extends \SQL\Record, while DateTime does not. That’s how the isReference function works:

private static function isReference(string $columnName) {
    return is_subclass_of(static::getTypeFor($columnName), "\SQL\Record");
}

Registering a model

Before we can use the models, we must, at least once, run the register function in order to create the tables.

User::register();

Now let’s inspect the SQL statements it generated.

CREATE TABLE IF NOT EXISTS user (
    id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    deleted_at TIMESTAMP
);
ALTER TABLE user ADD username MEDIUMTEXT;
ALTER TABLE user ADD password MEDIUMTEXT;

The first statement is the same for all tables. The other lines are generated by looping over all the columns and getting their native type (string, int, bool, float), then using a lookup table provided by the type system to get the respective SQL type.

Post::register();

outputting

CREATE TABLE IF NOT EXISTS posts (
    id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    deleted_at TIMESTAMP
);
ALTER TABLE posts ADD title MEDIUMTEXT;
ALTER TABLE posts ADD content MEDIUMTEXT;
ALTER TABLE posts ADD author_id INT;
ALTER TABLE posts ADD CONSTRAINT fk_posts_user FOREIGN KEY(author_id) REFERENCES user(id);

By using the internal function described before, we use this snippet to generate the SQL:

$cols = array_diff(
    static::getColumns(),
    ["id", "created_at", "updated_at", "deleted_at"]
);

foreach ($cols as $c) {
    if (static::isReference($c)) {
        // The two lines below are very important because,
        // they effectively allow the user to change table names.
        $class_name = static::getTypeFor($c);
        $ref_table_name = $class_name::tableName();
        // push sql statements to array ...
    } else {
        // push sql statements to array ...
    }
}

The Record::exec function is finally used to execute the list of SQL statements.
Before going any further, we must necessarily understand how the type system works.

The type system

We, as users of the library, use two type of types: native types and complex types. Internally they’re all the same.

A type name is the return value of the PHP gettype function. Each type has a corresponding SQL type and has two associated functions:

  • the serialization function which takes the object and turns him into a valid SQL value that can be stored and used by the database
  • the “unserialization” function which takes the data from the database and reconstructs the object

If you look at the code map, you’ll see the Types::add function being used to declare default types.

// Type declaration example

\SQL\Types::add(
    "DateTime", "TIMESTAMP",
    function($value) {
        // turning the DateTime object into a
        // string that can be stored as a TIMESTAMP
        return $value->format('Y-m-d H:i:s');
    },
    function($value) {
        // turning the string back into the original object
        return \DateTime::createFromFormat('Y-m-d H:i:s', $value);
    }
);

For native types (int, string, etc) you’ll see that the (un)serialization functions are null. That is because they already have corresponding SQL types.

CRUD Operations

Create

$u = new User;
$u->username = "mark";
$u->create();

The code of the Record::create function is really well documented and easy to follow, but I will focus on the trick that makes it possible to create dynamic parameters for prepared SQL statements.

The statements we prepare has the following template:

INSERT INTO $name ($columns) VALUES ($col_values);

$columns is a string that is the result of joining the array of initiated columns.

$col_values is also a string, but it takes the following form:

:val_col1, :val_col2, :val_col3

Replace col1, col2 and col3 with actual column names and you get the general picture.

These parameters are later “filled” by the PDO bindValue function using the following snippet:

$stmt = \SQL\DB::getInstance()->prepare(
    "INSERT INTO $name ($columns) VALUES ($values)"
);

$cols = array_diff($this->getInitializedColumns(), ["id", "updated_at", "deleted_at"]);
foreach ($cols as $c) {
    if ($this::isReference($c)) {
            // observe
            $stmt->bindValue(":val_".$c, $this->{$c}->id);
        } else {
            $type = \SQL\Types::get($this::getTypeFor($c));
            $serialized = call_user_func($type["serializer"], $this->{$c});
            // yes, observe
            $stmt->bindValue(":val_".$c, $serialized);
        }
    }
}

$stmt->execute();

Update

The update function works similarly, but the SQL statement, generated by this snippet,

$values = join(", ", array_map(
    function ($c) { return sprintf("%s = :val_%s", $c, $c); },
    $this->getInitializedColumns()
));

takes the following form:

col1 = :val_col1, col2 = :val_col2

Fetch (and fetchAll)

The interesting part of this function is not the SQL statement part, which is really similar to the ones discussed above, but the way in which the result is returned and its fields “unserialized”.

$result = $stmt->fetchAll(\PDO::FETCH_ASSOC);
foreach ($result as $key => $value) {
    if (isset($result[$key])) {
        if ($this::isReference($key)) {
            // If the column is a foreign key, then
            // the column type must be a class that
            // extends the Record class.
            $obj = new ($this::getTypeFor($key));
            $obj->id = $value;
            $obj->fetch();
            $this->{$key} = $obj;
            // This basically takes the foreign key,
            // also the ID of the references model
            // and fetches it.
        } else {
            $type = \SQL\Types::get($this::getTypeFor($key));
            $unserialized = call_user_func($type["unserializer"], $value);
            $obj->{$key} = $unserialized;
        }
    }
}

The fetchAll does the process above for a list of SQL rows and returns an array.

And that’s how my ORM works.