Data modeling
What is data modeling?
The term data modeling refers to the process of defining the shape and structure of the objects in an application. These objects are often called "application models". In relational databases (like PostgreSQL), they are stored in tables. When using document databases (like MongoDB), they are stored in collections.
Depending on the domain of your application, the models will be different. For example, if you're writing a blogging application, you might have models such as blog, author, article. When writing a car-sharing app, you probably have models like driver, car, route. Application models enable you to represent these different entities in your code by creating respective data structures.
When modeling data, you typically ask questions like:
- What are the main entities/concepts in my application?
- How do they relate to each other?
- What are their main characteristics/properties?
- How can they be represented with my technology stack?
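As a rough illustration of how answers to these questions might translate into code, the following TypeScript sketch models the blogging example. All names and fields here are hypothetical and chosen only for illustration:

```typescript
// Hypothetical models for a blogging application; names and fields are illustrative.
interface Author {
  id: number
  name: string
  email: string
}

interface Article {
  id: number
  title: string
  body: string
  // Relation: every article is written by exactly one author.
  author: Author
}

interface Blog {
  id: number
  name: string
  // Relation: a blog contains many articles.
  articles: Article[]
}

const alice: Author = { id: 1, name: 'Alice', email: 'alice@prisma.io' }

const blog: Blog = {
  id: 1,
  name: 'Engineering blog',
  articles: [{ id: 1, title: 'Hello world', body: 'First post', author: alice }],
}

// Follow the relations from blog to article to author.
const firstAuthorName = blog.articles[0].author.name
```

The interfaces capture the entities and their properties, while the `author` and `articles` fields capture how the entities relate to each other.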
Data modeling without Prisma ORM
Data modeling typically needs to happen on (at least) two levels:
- On the database level
- On the application level (i.e., in your programming language)
The way that the application models are represented on both levels might differ due to a few reasons:
- Databases and programming languages use different data types
- Relations are represented differently in a database than in a programming language
- Databases typically have more powerful data modeling capabilities, like indexes, cascading deletes, or a variety of additional constraints (e.g. unique, not null, ...)
- Databases and programming languages have different technical constraints
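The first two differences can be sketched in TypeScript. The example below assumes a hypothetical `posts` table whose rows reference their author by foreign key and carry the creation time as an ISO 8601 string; the `PostRow`, `Post`, and `toPost` names are made up for this illustration:

```typescript
// Shape of a row from a hypothetical `posts` table.
interface PostRow {
  post_id: number
  title: string
  author_id: number // relation stored as a foreign key
  created_at: string // timestamp assumed to arrive as an ISO 8601 string
}

interface User {
  user_id: number
  name: string | null
}

// Shape of the same entity on the application level.
interface Post {
  id: number
  title: string
  author: User // relation expressed as an object reference
  createdAt: Date // timestamp expressed as a native Date
}

// Convert a database row into an application model.
function toPost(row: PostRow, usersById: Map<number, User>): Post {
  const author = usersById.get(row.author_id)
  if (author === undefined) {
    throw new Error(`No user with id ${row.author_id}`)
  }
  return {
    id: row.post_id,
    title: row.title,
    author,
    createdAt: new Date(row.created_at),
  }
}
```

A mapping function like this is where the two representations meet: the foreign key is resolved to an object, and the database value is converted to a language-level type.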
Data modeling on the database level
Relational databases
In relational databases, models are represented by tables. For example, you might define a users table to store information about the users of your application. Using PostgreSQL, you'd define it as follows:
CREATE TABLE users (
  user_id SERIAL PRIMARY KEY NOT NULL,
  name VARCHAR(255),
  email VARCHAR(255) UNIQUE NOT NULL,
  isAdmin BOOLEAN NOT NULL DEFAULT false
);
A visual representation of the users table with some random data might look as follows:
user_id | name | email | isAdmin |
---|---|---|---|
1 | Alice | alice@prisma.io | false |
2 | Bob | bob@prisma.io | false |
3 | Sarah | sarah@prisma.io | true |
It has the following columns:
- user_id: An integer that increments with every new record in the users table. It also represents the primary key for each record.
- name: A string with at most 255 characters.
- email: A string with at most 255 characters. Additionally, the added constraints express that no two records can have duplicate values for the email column, and that every record needs to have a value for it.
- isAdmin: A boolean that indicates whether the user has admin rights (default value: false)
MongoDB
In MongoDB databases, models are represented by collections, which contain documents that can have any structure:
{
  _id: '607ee94800bbe41f001fd568',
  slug: 'prisma-loves-mongodb',
  title: 'Prisma <3 MongoDB',
  body: "This is my first post. Isn't MongoDB + Prisma awesome?!"
}
Prisma Client currently expects a consistent and normalized model design. This means that:
- If a model or field is not present in the Prisma schema, it is ignored
- If a field is mandatory but not present in the MongoDB dataset, you will get an error
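To make these two rules concrete, here is a small TypeScript simulation. It is not how Prisma Client is implemented; `ModelSpec`, `postModel`, and `readDocument` are hypothetical names invented for this sketch, and the exact error Prisma raises will differ:

```typescript
// Hypothetical, simplified description of a model's fields.
type FieldSpec = { required: boolean }
type ModelSpec = Record<string, FieldSpec>

const postModel: ModelSpec = {
  _id: { required: true },
  slug: { required: true },
  title: { required: true },
  body: { required: false },
}

// Read a stored document through the lens of a model description.
function readDocument(
  model: ModelSpec,
  doc: Record<string, unknown>
): Record<string, unknown> {
  const result: Record<string, unknown> = {}
  // Only fields named in the model are visited, so anything else is ignored.
  for (const [field, spec] of Object.entries(model)) {
    const value = doc[field]
    if (value === undefined) {
      // A mandatory field that is missing from the data is an error.
      if (spec.required) {
        throw new Error(`Missing required field "${field}"`)
      }
      result[field] = null
    } else {
      result[field] = value
    }
  }
  return result
}

const stored = {
  _id: '607ee94800bbe41f001fd568',
  slug: 'prisma-loves-mongodb',
  title: 'Prisma <3 MongoDB',
  legacyTags: ['intro'], // not part of the model, so it is dropped
}

const post = readDocument(postModel, stored)
```

In this sketch, `post` contains `_id`, `slug`, `title`, and a `null` `body`, while `legacyTags` is silently left out. Passing a document without `title` would throw.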
Data modeling on the application level
In addition to creating the tables that represent the entities from your application domain, you also need to create application models in your programming language. In object-oriented languages, this is often done by creating classes to represent your models. Depending on the programming language, this might also be done with interfaces or structs.
There is often a strong correlation between the tables in your database and the models you define in your code. For example, to represent records from the aforementioned users table in your application, you might define a JavaScript (ES6) class looking similar to this:
class User {
  constructor(user_id, name, email, isAdmin) {
    this.user_id = user_id
    this.name = name
    this.email = email
    this.isAdmin = isAdmin
  }
}
When using TypeScript, you might define an interface instead:
interface User {
  user_id: number
  name: string
  email: string
  isAdmin: boolean
}
Notice how the User model in both cases has the same properties as the users table in the previous example. While it's often the case that there's a 1:1 mapping between database tables and application models, it can also happen that models are represented completely differently in the database and your application.
With this setup, you can retrieve records from the users table and store them as instances of your User type. The following example code snippet uses pg as the driver for PostgreSQL and creates a User instance based on the above defined JavaScript class:
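// `client` is a connected pg Client instance
// PostgreSQL folds unquoted identifiers to lowercase, so the column is aliased
// to keep the camelCase property name on the returned row
const result = await client.query(
  'SELECT user_id, name, email, isAdmin AS "isAdmin" FROM users WHERE user_id = 1'
)
// pg returns a result object; the records are in its `rows` array
const userData = result.rows[0]
const user = new User(
  userData.user_id,
  userData.name,
  userData.email,
  userData.isAdmin
)
// user = {
//   user_id: 1,
//   name: "Alice",
//   email: "alice@prisma.io",
//   isAdmin: false
// }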
Notice that in these examples, the application models are "dumb", meaning they don't implement any logic. Their sole purpose is to carry data as plain old JavaScript objects (POJOs).
Data modeling with ORMs
ORMs are commonly used in object-oriented languages to make it easier for developers to work with a database. The key characteristic of an ORM is that it lets you model your application data in terms of classes which are mapped to tables in the underlying database.
The main difference compared to the approaches explained above is that these classes not only carry data but also implement a substantial amount of logic. Most of this logic covers storage, retrieval, serialization, and deserialization, but sometimes the classes also implement business logic that's specific to your application.
This means you don't write SQL statements to read and write data in the database. Instead, the instances of your model classes provide an API to store and retrieve data.
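The idea can be sketched without any ORM library. In the following TypeScript example, an in-memory `Map` stands in for the database table, and the hypothetical `UserRecord` class carries both the data and the storage and retrieval logic. A real ORM would generate SQL where this sketch reads and writes the map:

```typescript
// In-memory stand-in for a database table; a real ORM would issue SQL instead.
const usersTable = new Map<number, { name: string; email: string }>()

class UserRecord {
  id: number
  name: string
  email: string

  constructor(id: number, name: string, email: string) {
    this.id = id
    this.name = name
    this.email = email
  }

  // Retrieval logic lives on the class itself.
  static findByPk(id: number): UserRecord | null {
    const row = usersTable.get(id)
    return row ? new UserRecord(id, row.name, row.email) : null
  }

  // Storage logic lives on the instance.
  save(): void {
    usersTable.set(this.id, { name: this.name, email: this.email })
  }
}

new UserRecord(1, 'Alice', 'alice@prisma.io').save()
const found = UserRecord.findByPk(1)
```

Here `found` is a `UserRecord` instance with its own methods rather than a plain data object, which is the key contrast with the "dumb" models from the previous section.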
Sequelize is a popular ORM in the Node.js ecosystem. This is how you'd define the same User model from the sections before using Sequelize's modeling approach:
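const { Sequelize, Model } = require('sequelize')

// `sequelize` is assumed to be an existing Sequelize connection instance
class User extends Model {}
User.init(
  {
    user_id: {
      type: Sequelize.INTEGER,
      primaryKey: true,
      autoIncrement: true,
    },
    name: Sequelize.STRING(255),
    email: {
      type: Sequelize.STRING(255),
      unique: true,
      allowNull: false,
    },
    isAdmin: {
      type: Sequelize.BOOLEAN,
      allowNull: false,
      defaultValue: false,
    },
  },
  { sequelize, modelName: 'user' }
)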
To get an example with this User class to work, you still need to create the corresponding table in the database. With Sequelize, you have two ways of doing this:
- Run User.sync() (typically not recommended for production)
- Use Sequelize migrations to change your database schema
Note that you'll never instantiate the User class manually (using new User(...)) as was shown in the previous section, but rather call static methods on the User class which then return the User model instances:
const user = await User.findByPk(42)
The call to findByPk creates a SQL statement to retrieve the User record that's identified by the ID value 42.
The resulting user object is an instance of Sequelize's Model class (because User inherits from Model). It's not a POJO, but an object that implements additional behavior from Sequelize.
Data modeling with Prisma ORM
Depending on which parts of Prisma ORM you want to use in your application, the data modeling flow looks slightly different. The following two sections explain the workflows for using only Prisma Client and using Prisma Client and Prisma Migrate.
No matter which approach you choose, with Prisma ORM you never create application models in your programming language by manually defining classes, interfaces, or structs. Instead, the application models are defined in your Prisma schema:
- Only Prisma Client: Application models in the Prisma schema are generated based on the introspection of your database schema. Data modeling happens primarily on the database level.
- Prisma Client and Prisma Migrate: Data modeling happens in the Prisma schema by manually adding application models to it. Prisma Migrate maps these application models to tables in the underlying database (currently only supported for relational databases).
As an example, the User model from the previous example would be represented as follows in the Prisma schema:
model User {
  user_id Int     @id @default(autoincrement())
  name    String?
  email   String  @unique
  isAdmin Boolean @default(false)
}
Once the application models are in your Prisma schema (whether they were added through introspection or manually by you), the next step typically is to generate Prisma Client which provides a programmatic and type-safe API to read and write data in the shape of your application models.
Prisma Client uses TypeScript type aliases to represent your application models in your code. For example, the User model would be represented as follows in the generated Prisma Client library:
export declare type User = {
  user_id: number
  name: string | null
  email: string
  isAdmin: boolean
}
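To show what such a type alias gives you in practice, the sketch below declares a local copy of the type, with field names following the Prisma schema above, and uses it in a hypothetical helper called `displayName`. In a real project you would import the generated type rather than redeclare it:

```typescript
// Local copy of the model type, assumed to mirror what is generated for `User`.
type User = {
  user_id: number
  name: string | null
  email: string
  isAdmin: boolean
}

// `name` is optional in the schema (`String?`), so its type includes `null`
// and the compiler requires that case to be handled.
function displayName(user: User): string {
  return user.name ?? user.email
}

const alice: User = {
  user_id: 1,
  name: 'Alice',
  email: 'alice@prisma.io',
  isAdmin: false,
}

const unnamed: User = {
  user_id: 4,
  name: null,
  email: 'unnamed@prisma.io',
  isAdmin: false,
}
```

Calling `displayName(alice)` yields the name, while `displayName(unnamed)` falls back to the email address because `name` is `null`.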
In addition to the generated types, Prisma Client also provides a data access API that you can use once you've installed the @prisma/client package:
import { PrismaClient } from '@prisma/client'
// or
// const { PrismaClient } = require('@prisma/client')
const prisma = new PrismaClient()
// use inside an `async` function to `await` the result
await prisma.user.findUnique(...)
await prisma.user.findMany(...)
await prisma.user.create(...)
await prisma.user.update(...)
await prisma.user.delete(...)
await prisma.user.upsert(...)
Using only Prisma Client
When using only Prisma Client and not using Prisma Migrate in your application, data modeling needs to happen on the database level via SQL. Once your SQL schema is ready, you use Prisma's introspection feature to add the application models to your Prisma schema. Finally, you generate Prisma Client which creates the types as well as the programmatic API for you to read and write data in your database.
Here is an overview of the main workflow:
- Change your database schema using SQL (e.g. CREATE TABLE, ALTER TABLE, ...)
- Run prisma db pull to introspect the database and add application models to the Prisma schema
- Run prisma generate to update your Prisma Client API
Using Prisma Client and Prisma Migrate
When using Prisma Migrate, you define your application models in the Prisma schema. With relational databases, you use the prisma migrate subcommand to generate plain SQL migration files, which you can edit before applying. With MongoDB, you use prisma db push instead, which applies the changes to your database directly.
Here is an overview of the main workflow:
- Manually change your application models in the Prisma schema (e.g. add a new model, remove an existing one, ...)
- Run prisma migrate dev to create and apply a migration or run prisma db push to apply the changes directly (in both cases Prisma Client is automatically generated)