PostgreSQL does support a lot of security features that are often not applied to use cases. Such features include the ability for column-level and row-level security constraint. This article presents a simple use case to allow readers to better understand the capabilities of this great database.
What you will learn
- How to enable/disable column-level and row-level security
- How to make functions that can bypass such constraints
- How PostgreSQL defines row-level policies and how to enable it
What you need to know
- How to interact with a PostgreSQL (9.6 or higher) database
- How to create tables, users, and functions
- How to run stored procedures
PostgreSQL, as many other relational databases, provides support for GRANT and REVOKE commands used to provide privileges on objects (e.g., tables) to users. In PostgreSQL, it is possible to extend the privileges in a vertical way (i.e., at the column level) or in a horizontal way (i.e., at the row-level).
Enabling column and row level security allows for a better hardening of the data in the database itself, and prevents malicious or corrupted applications from accessing private data.
Here is a specific explanation to better understand the difference between the two directions security can move:
- column level security dictates that a user cannot access more columns in a result set than the ones granted by the privileges. It does not matter which filters (e.g., WHERE conditions) or sorting the user is going to use in her queries, it will not be able to extract data from the private columns.
- row level security enables a kind of virtual horizontal data split, and dictates that the user will not be able to handle any row outside specific criteria. This is applied before any query condition or filtering. In other words, it is like the whole result set has been shrunk down before it is actually evaluated for the user.
A Simple Example
To demonstrate all the above concepts, consider the following simple example: a table of products that contains information about who manages (e.g., sells, store, move) the products and about the price of each of them. It should be immediately clear that some users should be able to see the prices, while others should never be enabled to see the price, without any regard to the product itself (i.e., the tuple), and this is a perfect candidate for a column level security. On the other hand, users can only manage products that have been assigned to them, and this is a perfect candidate for a row-level security.
Having defined the problem, let’s see how the products table could be defined:
CREATE TABLE products ( pk integer GENERATED ALWAYS AS IDENTITY, id text, description text, managed_by text, price money DEFAULT 0, qty int DEFAULT 1, ts timestamp DEFAULT CURRENT_TIMESTAMP, UNIQUE ( id, managed_by ), CHECK ( qty >= 0 ), CHECK ( price >= 0::money ) );
The fields, managed_by and price, are those that will be attached to row-level and column-level security constraints respectively.
For the sake of keeping the example really small and simple, only three users are considered:
CREATE ROLE stock_app WITH LOGIN CONNECTION LIMIT 0; CREATE ROLE boss WITH LOGIN CONNECTION LIMIT 1 PASSWORD 'XXXX' IN ROLE stock_app; CREATE ROLE user_a WITH LOGIN CONNECTION LIMIT 1 PASSWORD 'XXXX' IN ROLE stock_app; CREATE ROLE user_b WITH LOGIN CONNECTION LIMIT 1 PASSWORD 'XXXX' IN ROLE stock_app;
The stoc_app is simply a group, where all users will belong to. This allows for managing some simple login rules, such as putting in pg_hba.conf something like:
host all +stocker_app 127.0.0.1/32 md5
that allows any user in the stocker_app group to connect to the database.
Populating the Table
To be able to test the security concepts, let’s populate the table with some random data. In this step, prices, quantities and assigned users will be randomly chosen.
INSERT INTO products( id, description, managed_by, qty, price ) SELECT pnum , 'Article nr.' || pnum , CASE WHEN pnum % 2 = 0 THEN 'user_a' ELSE 'user_b' END , abs( random() * 1234 )::int , abs( random() * 100 )::decimal::money FROM generate_series( 1, 1000 ) AS pnum;
The above query will populate the products table with one thousand random items. Half of such items will be assigned to user_a and the remaining half to the user_b.
Applying Security Constraints
It is now time to apply security constraints. The first step is removing all privileges automatically assigned to everybody from the table products. To reset all permissions (this is one of the best practices to keep the situation clear):
REVOKE ALL ON products FROM PUBLIC; GRANT ALL ON products TO boss;
So far, only the user boss can do everything on the table. Better, the table creator and the boss user will be the only one who can manage the table as a whole.
The second step is to apply column-level security. So, for instance consider that case where both user_a and user_b can see only the columns id, description and qty:
GRANT SELECT( id, description, qty ) ON products TO user_a; GRANT SELECT( id, description, qty ) ON products TO user_b;
After that, both user_a and user_b will not be able to issue a SELECT statement that includes more columns other than the ones defined above:
> SELECT current_user; current_user -------------- user_a > SELECT * FROM products; ERROR: permission denied for relation products
Now it is time to enable row-level security. In PostgreSQL, row-level security is enabled via a policy that defines the constraints on the table. Such constraints can be defined with WHERE-like conditions, and can be specified for both SELECT statements and INSERT, UPDATE ones.
The policy in this case will be:
CREATE POLICY products_policy ON products FOR ALL USING ( managed_by = CURRENT_USER ) WITH CHECK ( price IS NOT NULL AND managed_by = CURRENT_USER );
The products_policy attaches to ALL commands (SELECT, UPDATE, DELETE, INSERT) and imposes that data retrieval is performed only for those tuples where the condition ( managed_by = CURRENT_USER ) is true. The special variable CURRENT_USER, as shown before, is the username that is currently running the connection. The CHECK condition applies to data writes. Therefore, the products_policy means that the users will be able to see only rows there managed_by contains the respective username, and will not be able to write on rows where the price has not been defined and that belongs to other users.
So far the policy is still not active. To enable it, an ALTER TABLE has to be issued:
ALTER TABLE products ENABLE ROW LEVEL SECURITY;
The products_policy does have a problem: it is too restrictive and does not allow the user boss to handle any data at all since no rows satisfy the condition managed_by = ‘boss’. Since PostgreSQL allows several policies to be attached to a table, it is possible to define a specific policy for the user boss as follows:
CREATE POLICY products_boss_policy ON products FOR ALL USING ( CURRENT_USER = 'boss' ) WITH CHECK ( CURRENT_USER = 'boss' );
There is no need to re-enable the ROW LEVEL SECURITY since it has already been enabled, and it dynamically applies all the defined policies.
Now, in order to check what the two policies mean, it is possible to count how many rows every user can get:
-- as user 'boss' > SELECT CURRENT_USER, count(id) FROM products; current_user | count --------------+------- boss | 1000 -- as 'user_a' > SELECT CURRENT_USER, count(id) FROM products; current_user | count --------------+------- user_a | 500 -- as 'user_b' > SELECT CURRENT_USER, count(id) FROM products; current_user | count --------------+------- user_b | 500
Thus, the boss user can see every row while other users see a restricted data-set.
How are policies applied?
The EXPLAIN command reveals how a query is executed against the table:
> EXPLAIN (FORMAT TEXT) SELECT CURRENT_USER, count(id) FROM products; QUERY PLAN | Aggregate (cost=34.76..34.77 rows=1 width=72) QUERY PLAN | -> Seq Scan on products (cost=0.00..33.50 rows=503 width=3) QUERY PLAN | Filter: ((managed_by = (CURRENT_USER)::text) OR (CURRENT_USER = 'boss'::name))
The explain output reveals that the row level security is applied at the very bottom, i.e., at the very beginning of the query, as a condition filter. It is interesting to note that the two policies have been tied by a logical OR.
However, as already stated, policies are not applied in the strict sense of WHERE filters, and in fact, if a user tries to circumvent the filter, even in the EXPLAIN, he or she gets an error:
-- as 'user_a' > SELECT CURRENT_USER, count(id) FROM products WHERE managed_by = 'user_b'; ERROR: permission denied for relation products
Escaping the policies (in the good way)
What if each user needs to get some sort of statistical or aggregate information even on the columns and rows he or she cannot access? The solution is to provide one (or more) stored procedures with a higher execution privilege, something like the suid for Unix executables.
So, in case there is the need to provide the totals to every user, it is possible to define a function as follows:
CREATE TYPE p_stats AS ( tot_qty int, tot_count int, tot_price money ); CREATE FUNCTION f_products_stats() RETURNS p_stats AS $BODY$ DECLARE tot p_stats%rowtype; BEGIN SELECT sum( qty ) AS q , count(id) AS c , sum( price ) AS s INTO STRICT tot FROM products; RETURN tot; END $BODY$ LANGUAGE plpgsql SECURITY DEFINER;
The f_products_stats simply computes some aggregated information and returns it as a record of type p_stats, something that is definitely a row and can be used in queries. The point is that such a function needs, obviously, to access the whole products table without any regard to who is actually running the function, and that’s the special SECURITY DEFINER privilege. In particular, SECURITY DEFINER means that the stored procedure will be executed with all privileges of the user that has defined the function itself (so in this sense it is similar to suid on Unix executables), as opposed to the default behaviour of SECURITY INVOKER that will run the procedure with privileges of any current user.
This allows any unprivileged user to get total information even if she cannot access every single piece of data:
-- as 'user_a' > SELECT CURRENT_USER, * FROM f_products_stats(); current_user | tot_qty | tot_count | tot_price --------------+---------+-----------+------------ user_a | 621687 | 1000 | $50,764.81 -- as 'boss' > SELECT CURRENT_USER, * FROM f_products_stats(); current_user | tot_qty | tot_count | tot_price --------------+---------+-----------+------------ boss | 621687 | 1000 | $50,764.81
The totals are the same, without any regard of the specific user that is running the function and, therefore, the query within it.
PostgreSQL offers a complex and fine grained set of GRANT/REVOKE, as well as row level policies to control who can access and manage every single piece of data. Thanks to the privileges that can be granted to stored procedures, this will not prevent users from querying data that is prohibited during normal operations, therefore allowing developers to implement a quite complex and tuned interface for data management.
About Luca Ferrari
Luca Ferrari lives in Italy with his beautiful wife, his great son, and two female cats.
Computer science passionate since the Commodore 64 age, he holds a master degree and a Ph.D. in Computer Science. He is a PostgreSQL enthusiast, a Perl lover, an Operating System passionate, a UNIX fan and performs as many tasks as possible within Emacs. He considers Open Source the only truly sane way of doing software and services.
His website is available at http://fluca1978.github.io
- PostgreSQL website: http://www.postgresql.org
- PostgreSQL Documentation https://www.postgresql.org/docs/
- Example code in this article: https://github.com/fluca1978/fluca1978-pg-utils/blob/master/examples/simple_security_example.sql