web2py: Normalized many-to-many model with multiselect drop-down
The Python framework web2py is pretty powerful when it comes to automagical form and data handling. A quite common use case in an application is the need to manage a business object that might have many relations to some other business object, e.g. one package might consist of many components. If you use one database table for each object there are at least two approaches to create a many-to-many relation in web2py. ## Approach One (simple): list:reference
Let us have a look at this DAL (Database Abstraction Layer) model:
db.define_table(’component’,
Field(’a_component_field’),
format=’%(a_component_field)s’)
db.define_table(’package_one’,
Field(’a_package_field’),
Field(’component_ids’, ‘list:reference component’)
format=’%(a_package_field)s’)
When you open the appadmin
database administration section of the application, the form to add a new package should look like this, given there are some components already:
You just have to use list:reference component
and you will get a multiselect drop-down.
The list:reference
is described in the web2py manual as:
The
list:reference
and thecontains(value)
operator are particularly useful to de-normalize many-to-many relations. [...]Notice that a
list:reference
tag field get a default constraintrequires = IS_IN_DB(db,’tag.id’,db.tag._format,multiple=True)
that produces a SELECT/OPTION multiple drop-box in forms.
Why might that approach be problematic in a relational database?
list:reference
stores the referenced IDs in the corresponding field as a string, concatenated by a vertical bar (or pipe). This is a classical SQL antipattern (book recommendation) for various reasons, some of them are:
- You cannot ensure uniqueness.
- It violates the first normal form (1NF).
- It is not possible to delete values without reading all values, parsing them and storing all but the obsolete values again.
- Counting is difficult.
- Query performance might be an issue when the table contains lots of these values.
- A string field of an SQL table has a limited size depending on its data type. If not chosen properly, this will literally cut references.
- The relation between the objects cannot have specific properties.
Some issues are fortunately treated by the DAL, transparent to the developer, some cannot be solved that way. There are also times where you might want to query the database directly using plain SQL or some other query tool (Adminer, PhpMyAdmin, DB Manager in your IDE) which requires a consistent database structure independent from DAL.
I recommend to carefully think about the entity relationship model and try to estimate the likely amount of data and the type of data you want to store with this approach. web2py also offers other list
types like list:string
and list:integer
that might be used for a small amount of denormalized data.
A clear advantage of this approach: It takes almost no time to implement and is therefore definitely useful for quick prototypes or small applications.
Approach two (failed): Intersection table and IS_IN_DB
requirement
The next option naturally comes in mind: Create an intersection table and use the IS_IN_DB
validator to create a multiselect drop-down. Unfortunately, this approach fails and causes an exception when the form is processed:
int() argument must be a string or a number, not ‘list’
Model:
db.define_table(’component’,
Field(’a_component_field’),
format=’%(a_component_field)s’)
db.define_table(’package’,
Field(’a_package_field’),
format=’%(a_package_field)s’)
db.define_table(’component_package_association’,
Field(’component_id’, ‘reference component’),
Field(’package_id’, ‘reference package’))
db.component_package_association.component_id.requires = IS_IN_DB(
db, ‘component.id’, ‘%(a_component_field)s’, multiple=True)
Controller:
def manage_packages():
if request.args(0) == ‘new’ and request.args(1) == ‘package’:
form = SQLFORM.factory(db.package, db.component_package_association)
if form.process().accepted:
# Code to store the new package and associations to components
# ...
response.flash = ‘Package has been created successfully.’
content = form
else:
content = SQLFORM.grid(db.package)
return dict(content=content)
web2py’s DAL does not know how to validate component_package_association.component_id
when a list of IDs is supplied. I could not find a way to get it working.
Approach three (recommended): Intersection table with custom insert handling
This way takes into account the problems mentioned above while retaining the view and behaviour for the user. Caveat: The admin will not have a multiple drop-down in the appadmin
database administration section since a modified controller is needed.
Model:
db.define_table(’component’,
Field(’a_component_field’),
format=’%(a_component_field)s’)
db.define_table(’package’,
Field(’a_package_field’),
format=’%(a_package_field)s’)
db.define_table(’component_package_association’,
Field(’component_id’, ‘reference component’),
Field(’package_id’, ‘reference package’))
Controller:
def manage_packages():
if request.args(0) == ‘new’ and request.args(1) == ‘package’:
# (1) Get available components
components = [(r.id, r.a_component_field) for r in db(db.component).select()]
# (2) Build the form
form = SQLFORM.factory(
db.package,
Field(
‘components’,
requires=IS_IN_SET(components, multiple=True)
)
)
# (3) Validate form data
if form.process().accepted:
# (4) Insert package
package_id = db.package.insert(
**db.package._filter_fields(form.vars))
if package_id and form.vars.components:
# (5) Insert component package associations
for component_id in form.vars.components:
existing_component = db.component(component_id)
if existing_component:
db.component_package_association.insert(
package_id=package_id,
component_id=existing_component
)
response.flash = ‘Package has been created successfully.’
content = form
else:
content = SQLFORM.grid(db.package)
return dict(content=content)
This sample controller can properly add new packages and list packages:
- It queries available components and builds a list containing sets with component primary keys and a description field.
- It builds a custom field that is treated as a set. The
IS_IN_SET
validator creates the same multiselect drop-down as in the previous approaches, it is predestinated to be applied to lists containing sets and treats the ids and labels properly. - The form data is validated, the
IS_IN_SET
is truthy. - The package is inserted and the id is returned
- Each
component_id
is stored together with thepackage_id
as a new association
Please note: The edit function of the SQLFORM.grid
will use the default form to edit one element of a table. The code for adding packages should provide an idea of how to implement the same functionality for editing entries. In a real application additional checks and error handling are required.
If you find issues with my approach or know how to improve it further, please drop a comment.

About the author
Jan Beilicke is a long-time IT professional and full-time nerd. Open source enthusiast, advocating security and privacy. Sees the cloud as other people's computers. Find him on Mastodon.