The second method of database creation is to do it off-line, using the index generation tools. This method is best if you have many thousands of entries to create, which would take an unacceptably long time using the LDAP method, or if you want to ensure the database is not accessed while it is being created.
suffix <dn>
As described in the preceding section, this option says what entries are to be held by this database. You should set this to the DN of the root of the subtree you are trying to create. For example
suffix "o=University of Michigan, c=US"
You should be sure to specify a directory where the index files should be created:
directory <directory>
For example:
directory /usr/local/umich-slapd
You need to make it so you can connect to slapd as somebody with permission to add entries. This is done through the following two options in the database definition:
rootdn <dn>
rootpw <passwd>
These options specify a DN and password that can be used to authenticate as the "superuser" entry of the database (i.e., the entry allowed to do anything). The DN and password specified here will always work, regardless of whether the entry named actually exists or has the password given. This solves the chicken-and-egg problem of how to authenticate and add entries before any entries yet exist.
Finally, you should make sure that the database definition contains the index definitions you want:
index {<attrlist> | default} [pres,eq,approx,sub,none]
For example, to index the cn, sn, uid and objectclass attributes the following index configuration lines could be used.
index cn,sn,uid
index objectclass pres,eq
index default none
See Section 4 on the configuration file for more details on this option. Once you have configured things to your liking, start up slapd, connect with your LDAP client, and start adding entries. For example, to add the U of M entry followed by a Postmaster entry using the ldapadd tool, you could create a file called /tmp/newentry with the contents:
o=University of Michigan, c=US
objectClass=organization
o=University of Michigan
description=University of Michigan at Ann Arbor

cn=Postmaster, o=University of Michigan, c=US
objectClass=organizationalRole
cn=Postmaster
description=U of M postmaster - postmaster@umich.edu
and then use a command like this to actually create the entry:
ldapadd -f /tmp/newentry -D "cn=Manager, o=University of Michigan, c=US" -w secret
The above command assumes that you have set rootdn to "cn=Manager, o=University of Michigan, c=US" and rootpw to "secret".
suffix <dn>
As described in the preceding section, this option says what entries are to be held by this database. You should set this to the DN of the root of the subtree you are trying to create. For example
suffix "o=University of Michigan, c=US"
You should be sure to specify a directory where the index files should be created:
directory <directory>
For example:
directory /usr/local/umich-slapd
Next, you probably want to increase the size of the in-core cache used by each open index file. For best performance during index creation, the entire index should fit in memory. If your data is too big for this, or your memory too small, you can still make it pretty big and let the paging system do the work. This size is set with the following option:
dbcachesize <integer>
For example:
dbcachesize 50000000
This would create a cache 50 MB big, which is pretty big (at U-M, our database has about 125K entries, and our biggest index file is about 45 MB). Experiment with this number a bit, and the degree of parallelism (explained below), to see what works best for your system. Remember to turn this number back down once your index files are created and before you run slapd.
Finally, you need to specify which indexes you want to build. This is done by one or more index options.
index {<attrlist> | default} [pres,eq,approx,sub,none]
For example:
index cn,sn,uid pres,eq,approx
index default none
This would create presence, equality and approximate indexes for the cn, sn, and uid attributes, and no indexes for any other attributes. See the configuration file section for more information on this option.
ldif2ldbm -i <inputfile> -f <slapdconfigfile> [-d <debuglevel>]
        [-j <integer>] [-n <databasenumber>] [-e <etcdir>]
The arguments have the following meanings:
-i <inputfile>
Specifies the LDIF input file containing the entries to add in text form (described below in Section 8.3).
-f <slapdconfigfile>
Specifies the slapd configuration file that tells where to create the indexes, what indexes to create, etc.
-d <debuglevel>
Turn on debugging, as specified by <debuglevel>. The debug levels are the same as for slapd (see Section 6.1).
-j <integer>
An optional argument that specifies that at most <integer> processes should be started in parallel when building the indexes. The default is 1. If set to a value greater than one, ldif2ldbm will create at most that many subprocesses at a time when building the indexes. A separate subprocess is created to build each attribute index. Running these processes in parallel can speed things up greatly, but beware of creating too many processes, all competing for memory and disk resources.
-n <databasenumber>
An optional argument that specifies the configuration file database for which to build indices. The first database listed is "1", the second "2", etc. By default, the first ldbm database in the configuration file is used.
-e <etcdir>
An optional argument that specifies the directory where ldif2ldbm can find the other database conversion tools it needs to execute (ldif2index and friends). The default is the installation ETCDIR.
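The effect of -j can be pictured with a small Python sketch. It is not how ldif2ldbm is implemented; it only illustrates running one ldif2index subprocess per attribute with at most a fixed number active at once. The function names, the `run` parameter, and the file names in the usage note are hypothetical, and the other steps ldif2ldbm performs (such as building id2entry) are left out.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def ldif2index_argv(inputfile, configfile, attr):
    # Command line for one attribute index, using only the documented
    # -i and -f options followed by the attribute name.
    return ["ldif2index", "-i", inputfile, "-f", configfile, attr]

def build_indexes(inputfile, configfile, attrs, jobs=1, run=subprocess.check_call):
    # At most `jobs` commands are active at a time, mirroring "-j <integer>".
    # `run` is injectable so the sketch can be exercised without the tools.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        futures = [
            pool.submit(run, ldif2index_argv(inputfile, configfile, attr))
            for attr in attrs
        ]
        for future in futures:
            future.result()  # re-raise any failure from a worker
```

For instance, `build_indexes("entries.ldif", "slapd.conf", ["cn", "sn", "uid"], jobs=2)` would start three index builds, two at a time, assuming ldif2index is on the search path.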
The next sections describe the programs invoked by ldif2ldbm when it is building indexes. Normally, these programs are invoked for you, but occasionally you may want to invoke them yourself.
ldif2index -i <inputfile> -f <slapdconfigfile> [-d <debuglevel>]
        [-n <databasenumber>] <attr>
Where the -i, -f, -d, and -n options are the same as for the ldif2ldbm program. <attr> is the attribute to build an index for. Which indexes are built (e.g., equality, substring, etc.) is controlled by the corresponding index line in the slapd configuration file.
You can use the ldbmcat program to create a suitable LDIF input file from an existing LDBM database.
ldif2id2entry -i <inputfile> -f <slapdconfigfile> [-d <debuglevel>]
        [-n <databasenumber>]
The arguments are the same as for the ldif2ldbm program.
ldif2id2children -i <inputfile> -f <slapdconfigfile> [-d <debuglevel>]
        [-n <databasenumber>]
The arguments are the same as for the ldif2ldbm program. You can use the ldbmcat program to create a suitable LDIF input file from an existing LDBM database.
ldbmcat [-n] <filename>
where <filename> is the name of the id2entry index file. The corresponding LDIF output is written to standard output.
The -n option can be used to prevent the printing of entry IDs in the LDIF format. If you are creating an LDIF format for use as input to ldif2index or anything but ldif2ldbm, you should not use the -n option (because the entry IDs must match those already in the id2entry file). If you are just making a backup of your data, you can use the -n option to save space.
ldif [-b] <attrname>
where <attrname> is the name of the attribute. Without the -b option, ldif considers each line of standard input to be a separate value of the attribute.
The -b option can be used to force ldif to interpret its input as a single raw binary value. This option is useful when converting binary data such as a jpegPhoto or audio attribute.
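As a rough illustration of what the ldif filter does, the Python sketch below turns input bytes into LDIF attribute lines. It is an approximation, not the tool's source: it applies the LDIF rule that a value is written in base 64 when it begins with a space or colon or contains a non-printing character, it assumes that -b always yields a base 64 value, and it does not fold long lines. The function names are invented for the example.

```python
import base64

def needs_base64(value: bytes) -> bool:
    # Base 64 is needed if the value starts with a space or colon,
    # or contains any byte outside the printable ASCII range.
    if value[:1] in (b" ", b":"):
        return True
    return any(b < 0x20 or b > 0x7E for b in value)

def ldif_lines(attr: str, data: bytes, binary: bool = False) -> list:
    # Without -b, each input line is a separate value; with -b, the
    # whole input is a single raw binary value.
    values = [data] if binary else data.splitlines()
    out = []
    for value in values:
        if binary or needs_base64(value):
            out.append(attr + ":: " + base64.b64encode(value).decode("ascii"))
        else:
            out.append(attr + ": " + value.decode("ascii"))
    return out
```

Calling `ldif_lines("cn", b"Babs Jensen\n")` gives one plain `cn:` line, while passing `binary=True` encodes the entire input as one value.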
[<id>]
dn: <distinguished name>
<attrtype>: <attrvalue>
<attrtype>: <attrvalue>
...
where <id> is the optional entry ID (a positive decimal number). Normally, you would not supply the <id>, allowing the database creation tools to do that for you. The ldbmcat program, however, produces an LDIF format that includes <id> so that new indexes created will be consistent.
A line may be continued by starting the next line with a single space or tab character. e.g.,
dn: cn=Barbara J Jensen, o=University of Michi
 gan, c=US
Multiple attribute values are specified on separate lines. e.g.,
cn: Barbara J Jensen
cn: Babs Jensen
If an <attrvalue> contains a non-printing character, or begins with a space or a colon `:', the <attrtype> is followed by a double colon and the value is encoded in base 64 notation. e.g., the value " begins with a space" would be encoded like this:
cn:: IGJlZ2lucyB3aXRoIGEgc3BhY2U=
Multiple entries within the same LDIF file are separated by blank lines. Here's an example of an LDIF file containing three entries.
dn: cn=Barbara J Jensen, o=University of Michi
 gan, c=US
cn: Barbara J Jensen
cn: Babs Jensen
objectclass: person
sn: Jensen

dn: cn=Bjorn J Jensen, o=University of Michi
 gan, c=US
cn: Bjorn J Jensen
cn: Bjorn Jensen
objectclass: person
sn: Jensen

dn: cn=Jennifer J Jensen, o=University of Michi
 gan, c=US
cn: Jennifer J Jensen
cn: Jennifer Jensen
objectclass: person
sn: Jensen
jpegPhoto:: /9j/4AAQSkZJRgABAAAAAQABAAD/2wBDABALD
 A4MChAODQ4SERATGCgaGBYWGDEjJR0oOjM9PDkzODdASFxOQ
 ERXRTc4UG1RV19iZ2hnPk1xeXBkeFxlZ2P/2wBDARESEhgVG
...
Notice that the jpegPhoto in Jennifer Jensen's entry is encoded using base 64. The ldif program (described in Section 8.2.6) can be used to produce the LDIF format.
NOTE: Trailing spaces are not trimmed from values in an LDIF file. Nor are multiple internal spaces compressed. If you don't want them in your data, don't put them there.
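The format rules above (continuation lines, double-colon base 64 values, blank-line separators, and the optional entry ID) can be pulled together in a short reader. This is a minimal Python sketch for illustration, not the parser slapd uses; `parse_ldif` is a made-up name, and the sketch assumes that spaces immediately after the colon are skipped while trailing spaces are kept, in line with the note above.

```python
import base64

def parse_ldif(text):
    """Return a list of (entry_id, [(attrtype, value_bytes), ...])."""
    entries = []
    for block in text.split("\n\n"):          # blank lines separate entries
        lines = []
        for raw in block.split("\n"):
            if raw[:1] in (" ", "\t") and lines:
                lines[-1] += raw[1:]          # continuation: drop the lead char
            elif raw:
                lines.append(raw)
        if not lines:
            continue
        entry_id = None
        if lines[0].isdigit():                # optional ID, as written by ldbmcat
            entry_id = int(lines.pop(0))
        attrs = []
        for line in lines:
            attr, _, rest = line.partition(":")
            if rest.startswith(":"):          # double colon: base 64 value
                value = base64.b64decode(rest[1:].strip())
            else:
                value = rest.lstrip(" ").encode()  # trailing spaces preserved
            attrs.append((attr, value))
        entries.append((entry_id, attrs))
    return entries
```

Values are returned as bytes so that binary attributes such as jpegPhoto survive unchanged.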
edb2ldif [-d] [-v] [-r] [-o] [-b <basedn>]
        [-a <addvalsfile>] [-f <fileattrdir>]
        [-i <ignoreattr...>] [<edbfile...>]
The LDIF data is written to standard output. The arguments have the following meanings:
-d
This option enables some debugging output on standard error.
-v
Enable verbose mode that writes status information to standard error, such as which EDB file is being processed, how many entries have been converted so far, etc.
-r
Recurse through child directories, processing all EDB files found.
-o
Cause local .add file definitions to override the global addfile (see -a below)
-b <basedn>
Specify the Distinguished Name that all EDB file entries appear below.
-a <addvalsfile>
The LDIF information contained in this file will be appended to each entry.
-f <fileattrdir>
Specify a single directory where all file-based attributes (typically sounds and images) can be found. If this option is not given, file attributes are assumed to be located in the same directory as the EDB file that refers to them.
-i <ignoreattr>
Specify an attribute that should not be converted. You can include as many -i flags as necessary.
<edbfile>
Specify a particular EDB file (or files) to read data from. By default, the EDB.root (if it exists) and EDB files in the current directory are used.
When edb2ldif is invoked, it will also look for files named .add in the directories where EDB files are found and append the contents of the .add file to each entry. Typically, this feature is used to include inherited attribute values (e.g., objectClass) that do not appear in the EDB files.
2. If you do not have a file named EDB.root in the same directory that contains your organizational or organizational unit entry, create it now by hand. Its contents should look something like this:
MASTER
000001
o=University of Michigan
objectClass= top & organization & domainRelatedObject &\
 quipuObject & quipuNonLeafObject
l= Ann Arbor, Michigan
st= Michigan
o= University of Michigan & UMICH & UM & U-M & U of M
description= The University of Michigan at Ann Arbor
associatedDomain= umich.edu
masterDSA= c=US@cn=Woolly Monkey
objectClass: person
edb2ldif -v -r -b "c=US" -i iattr -i acl -i xacl -i sacl \
        -i lacl -i masterDSA -i slaveDSA > ldif
5. Follow the steps outlined in section 8.2 above to produce an LDBM database from your new LDIF file.
ldbmtest [-d <debuglevel>] [-f <slapdconfigfile>]
The default configuration file in the ETCDIR is used if you don't supply one. By default, ldbmtest operates on the last database listed in the config file. You can specify an alternate database, or see the current database with the following commands.
b specify an alternate backend database
B print out the current backend database
The b command will prompt you for the suffix associated with the database you want. The database you select can be viewed and modified using a set of two-letter commands. The first letter selects the command function to perform. Possible commands and their meanings are as follows.
l lookup (do not follow indirection)
L lookup (follow indirection)
t traverse and print keys and data
T traverse and print keys only
x delete an index item
e edit an index item
a add an index item
c create an index file
i insert an entry into an index item
The second letter indicates which index the command applies to. The possible index selections are as follows.
c id2children index
d dn2id index
e id2entry index
f arbitrary file name
i attribute index
Each command may require additional arguments which ldbmtest will prompt you for.
To exit ldbmtest, type control-D or control-C.
Note that this is a very raw interface originally developed when testing the database format. It is provided and minimally documented here for interested parties, but it is not meant to be used by the inexperienced. See the next section for a brief description of the LDBM database format.
Using this simple scheme, many LDAP queries can be answered efficiently. For example, to answer a search for entries with a surname of "Jensen", slapd would first consult the surname attribute index, look up the value "Jensen" and retrieve the corresponding list of EIDs. Next, slapd would look up each EID in the id2entry index, retrieve the corresponding entry, convert it from text to LDAP format, and return it to the client.
The following sections give a very brief overview of each type of index and what it contains. For more detailed information see the paper "An X.500 and LDAP Database: Design and Implementation," available in postscript format from
ftp://terminator.rs.itd.umich.edu/ldap/papers/xldbm.ps
= equality keys
~ approximate equality keys
* substring equality keys
\ continuation keys
Key values are also normalized (e.g., converted to upper case for case ignore attributes). So, for example, to look up the surname equality value in the example above using the ldbmtest program, you would look up the value "=JENSEN".
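The surname lookup described above can be mimicked with two in-memory dictionaries. This is only a toy model, assuming Python dicts in place of the on-disk index files and a case ignore attribute; the names `sn_index`, `id2entry`, and `search_equality` are invented for illustration, and the entry text is abbreviated.

```python
# Toy model of an equality search; not slapd's actual code.
id2entry = {
    1: "dn: cn=Barbara J Jensen, o=University of Michigan, c=US\nsn: Jensen\n",
    2: "dn: cn=Bjorn J Jensen, o=University of Michigan, c=US\nsn: Jensen\n",
}

# Attribute index for sn: prefixed, normalized key -> list of entry IDs.
sn_index = {"=JENSEN": [1, 2]}

def search_equality(attr_index, id2entry, value):
    # Normalize the value, add the equality prefix, fetch the EID list,
    # then fetch each entry's text from id2entry.
    key = "=" + value.upper()
    eids = attr_index.get(key, [])
    return [id2entry[eid] for eid in eids]
```

A call such as `search_equality(sn_index, id2entry, "jensen")` returns the text of both Jensen entries; a value with no index key returns an empty list.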
Substring indexes are maintained by generating all possible N-character substrings for a value (N is 3 by default). These substrings are then stored in the attribute index, prefixed by "*". Additional anchors of "^" and "$" are added at the beginning and end of words. So, for example the surname of Jensen would cause the following keys to be entered in the index: ^JE, JEN, ENS, NSE, SEN, EN$.
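The key generation just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not slapd's code: values are upper-cased as for a case ignore attribute, words are split on whitespace, and `substring_keys` is a hypothetical name.

```python
def substring_keys(value, n=3):
    # Anchor each word with "^" and "$", then take every n-character window.
    keys = []
    for word in value.upper().split():
        anchored = "^" + word + "$"
        for i in range(len(anchored) - n + 1):
            key = anchored[i:i + n]
            if key not in keys:       # store each distinct key once
                keys.append(key)
    return keys

# Keys as they would be stored in the attribute index, with the "*" prefix.
index_keys = ["*" + k for k in substring_keys("Jensen")]
```

For the surname Jensen, `substring_keys` yields ^JE, JEN, ENS, NSE, SEN, and EN$, matching the list above.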
Approximate values are handled in a similar way, with phonetic codes being generated for each word in a value and then stored in the index, prefixed by "~".
Large blocks in the index are split into smaller ones. The smaller blocks are accessed through a level of indirection provided by the original block. They are stored in the index using the continuation key prefix of "\".
The dn2id index stores normalized DNs as keys. The data stored is the corresponding EID.
The id2children index stores EIDs as keys. The data stored is a list of EIDs, just as for the attribute indexes.
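To make the relationship between these two indexes concrete, here is a small Python sketch that builds both from a list of DNs. It rests on several assumptions that the text does not spell out: DNs are normalized by upper-casing and trimming spaces around commas, EIDs are assigned sequentially from 1, a parent is found by dropping the first RDN, and parents are added before their children. All names are hypothetical.

```python
def normalize_dn(dn):
    # Assumed normalization: upper case, no spaces around the commas.
    return ",".join(part.strip().upper() for part in dn.split(","))

def build_dn_indexes(dns):
    dn2id = {}          # normalized DN -> EID
    id2children = {}    # parent EID -> list of child EIDs
    for eid, dn in enumerate(dns, start=1):
        ndn = normalize_dn(dn)
        dn2id[ndn] = eid
        parent = ndn.partition(",")[2]   # DN with the first RDN removed
        if parent in dn2id:
            id2children.setdefault(dn2id[parent], []).append(eid)
    return dn2id, id2children
```

Feeding it the organization entry followed by two entries beneath it produces one id2children record keyed by the organization's EID.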
Send comments about this page to: ldap-support@umich.edu