DBSI server manual
DNA-Binding Site Identifier (DBSI) is a structure-based method for identifying DNA-binding sites on proteins that are known or believed to bind DNA. DBSI uses carefully designed and selected features that encapsulate the physical, chemical and geometric properties of the protein surface that enable DNA recognition. The model uses information from the three-dimensional structure of a protein to calculate several features describing residue characteristics, secondary structure, solvent accessibility, electrostatic potential, atomic density and surface geometry. Special emphasis was placed on capturing local and non-local co-operative effects, through features that take into account the residue micro-environment and features that capture longer-range co-operativity. The model uses Support Vector Machines (SVM) as a learning strategy to classify surface residues as binding or non-binding with respect to DNA. DBSI has been trained, validated and tested on several data sets consisting of high-resolution structures of protein-DNA complexes, and is among the most accurate methods for identifying DNA-binding sites.
The DBSI model was developed by optimizing the feature combination and training parameters using an iterative forward selection approach. In addition, by studying proteins with both bound and unbound structures, we demonstrate that DBSI can predict DNA-binding sites starting from the unbound structure with similar accuracy to predictions made using the bound protein structure. This is a significant observation, as prediction on unbound structures is the expected starting point in practical applications.
- Submitting a job
- Viewing job results
- Registration and login
- Browser requirements
The DBSI model has been trained on a dataset containing 262 crystal structures of protein-DNA complexes, extensively cross-validated, and tested on independent datasets that contained 206, 29 and 30 structures. Various performance metrics were computed for each of these datasets. DBSI has also been compared with several other sequence-based and structure-based methods for identifying DNA-binding sites. Further information regarding features, model optimization, performance metrics and model comparisons can be found here.
The DBSI server is a web-based interface to run DBSI, and visualize DBSI predictions overlaid on the protein structure. The DBSI server uses several third-party software packages and in-house programs to analyze and predict DNA binding sites on submitted structures. The webserver allows automation of these calculations and requires the submission of only the tarball obtained from the CHARMM-GUI (Instructions here) to run DBSI for the protein structure of your choice.
The citations given below provide a complete discussion of the development and performance of DBSI.
If you use the DBSI server in your work, please cite these papers:
X. Zhu, S. E. Ericksen, and J. C. Mitchell
DBSI: DNA Binding Site Identifier
Nucleic Acids Research, 41(16): e160, 2013.
S. Sukumar, X. Zhu, and J. C. Mitchell
The DBSI Server: Predicting DNA Binding Sites
DBSI is a structural model, and requires a 3-D structure of the protein for which DNA binding site predictions are to be made. Electrostatics features are integral to DBSI's calculations. These are derived from an electrostatic map for the structure that can be generated by solving the Poisson-Boltzmann equation using the CHARMM-GUI here.
Some caveats to note:
- Some of the third-party software that DBSI uses is incompatible with engineered residues such as MSE and with nonstandard names for different histidine protonation states. We advise you to rename all amino acids in your PDB file to standard nomenclature before submitting your protein to CHARMM-GUI.
- CHARMM parameterizes structures using segments rather than chains, and errors can arise from discrepancies between the number of chains and the number of segment IDs (segids) in CHARMM. Chain breaks can cause such a mismatch and will produce an error. This can be fixed by removing the internal TER cards that occur at strand breaks within each chain before submitting the structure to CHARMM-GUI.
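The two caveats above can be handled with a short preprocessing script. The following is a minimal sketch in Python, not part of DBSI or CHARMM-GUI. The function names and the rename table are hypothetical, and the table assumes that histidine protonation states appear under the common CHARMM (HSD, HSE, HSP) or AMBER (HID, HIE, HIP) names. It also assumes that converting selenomethionine to methionine means switching the record to ATOM and renaming the selenium atom to SD, and that a TER record is internal when the next ATOM record has the same chain ID.

```python
# Hypothetical rename table (an assumption); extend it for your own file.
RENAME = {"MSE": "MET", "HSD": "HIS", "HSE": "HIS", "HSP": "HIS",
          "HID": "HIS", "HIE": "HIS", "HIP": "HIS"}


def standardize_residue(line):
    """Rename a nonstandard residue in one ATOM/HETATM record (columns 18-20)."""
    if not line.startswith(("ATOM", "HETATM")):
        return line
    resname = line[17:20]
    if resname not in RENAME:
        return line
    if resname == "MSE":
        # Selenomethionine is usually stored as HETATM with a selenium atom.
        line = "ATOM  " + line[6:]
        if line[12:16].strip() == "SE":
            line = line[:12] + " SD " + line[16:]
            if len(line) >= 78:
                # Element symbol lives in columns 77-78.
                line = line[:76] + " S" + line[78:]
    return line[:17] + RENAME[resname] + line[20:]


def drop_internal_ter(lines):
    """Remove TER records that are followed by more atoms of the same chain."""
    kept = []
    last_chain = None
    for i, line in enumerate(lines):
        if line.startswith("ATOM"):
            last_chain = line[21]
        if line.startswith("TER"):
            following = next(
                (l for l in lines[i + 1:] if l.startswith("ATOM")), None)
            if following is not None and following[21] == last_chain:
                continue  # chain continues after this TER, so drop it
        kept.append(line)
    return kept


def clean_pdb(text):
    """Apply both fixes to the text of a PDB file and return the new text."""
    lines = [standardize_residue(l) for l in text.splitlines()]
    return "\n".join(drop_internal_ter(lines)) + "\n"
```

Residues are renamed first so that an MSE residue directly after a chain break counts as an ATOM record when TER cards are checked. Files with multiple models or alternate conformations are outside the scope of this sketch, so inspect the output before submitting it to CHARMM-GUI.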
In this example, we analyze only 2 protein chains (chains A and B) that constitute a dimer.
The next prompt allows us to specify some parameters to solve the Poisson-Boltzmann equation. Please note that modifications are necessary to some parameters. As highlighted in the figure, epsP is changed to 2.0, Dcel_c is changed to 1.0 and Dcel_f is changed to 0.5.
Generating the Electrostatics files can take some time. Once this process is complete, you may download the .tgz file by clicking the Download .tgz button. This is the file that serves as input to the DBSI server.
Once a .tgz file is generated by CHARMM-GUI, it can be uploaded to the DBSI server to predict DNA-binding sites. Open the Submit Job page, enter your email address (optional) and upload the .tgz file from CHARMM-GUI. A job name can be specified. By default, job privacy ensures that the results from the job queue are only readable by you. Also provided on this screen are options to download a sample PBEQ archive and a link that automatically loads test data to submit to the DBSI server.
When the upload is successful, a job directory is created on the server and the status of the job can be viewed on the job queue. A link with the job status is provided, where results will be displayed upon job completion.
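Before uploading, it can help to confirm locally that the downloaded file really is a gzip-compressed tar archive. The sketch below uses only the Python standard library. The function name is hypothetical, and the check covers the archive format only; it does not verify that the contents are what CHARMM-GUI produces, since the expected file layout is not described here.

```python
import tarfile


def is_valid_tgz(path):
    """Return True if path opens as a gzip-compressed tar archive
    that contains at least one member."""
    try:
        with tarfile.open(path, mode="r:gz") as archive:
            return len(archive.getnames()) > 0
    except (tarfile.TarError, OSError, EOFError):
        # Covers missing files, non-gzip data and truncated archives.
        return False
```

A file that fails this check, for example a renamed PDB file or an incomplete download, should be regenerated in CHARMM-GUI rather than uploaded.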
Clicking View results takes you to the Jmol visualization of the results where the DBSI scores are overlaid onto the structure. Clicking the JobID takes you to the results summary page (see example).
- View results takes you to the Jmol visualization of the results
- PDB file with analysis downloads the PDB file that has DBSI scores written into the B-factor column.
- DBSI analysis downloads a text file that gives DBSI scores for each individual surface residue, and a binary classification of whether it is predicted to be a binder or non-binder
The Jmol visualization has a control panel as well as options to change the background, display styles and coloring schemes. Residue-level display options are also implemented. Scores for individual residues and surface patches can be seen as chains of residues. To load an example, click here.
These controls alter the appearance of the selected atoms. By default, DBSI selects all protein atoms in the complex. Advanced users may change the atom selection by using the Jmol scripting language.
- Background : Change the color of the background
- Style : Change the representation of selected backbone or side-chain atoms
- Color : Change the color of the selected backbone or side-chain atoms
Additionally, users can save up to four different views of their session.
- Save : Record the current state of the display
- View : Restore the viewer to the saved state
Surface and DNA-Binding Residues
Each chain produces a unique group in the interface display.
- Show Chain : The checkbox by the chain name toggles whether the chain is displayed or hidden.
- Selection : The popup menu determines which subset of atoms is selected for action by the Display Controls (See Caveat).
The checkboxes in each cell control the display of an interface residue.
- Checkbox #1 : Highlight the residue with space filling
- Checkbox #2 : Add a translucent surface around the residue
The coloring within each cell also encodes information about the residue.
- Background color : Chemical type (gray = hydrophobic, yellow = polar, red = acidic, blue = basic, purple = HIS)
- Highlight color : Classification (pink = predicted binder, white = surface residue)
Caveat: If you use the console to make selections and change displays, the selections shown in the Control Panel may no longer be accurate. Actions taken using the console override any mouse-driven selection and display controls.
- Open Console : Opens the console, uses Rasmol commands (See Caveat)
- PDB File : Opens the PDB file used by the molecular viewer
- Jmol Help : Opens the documentation for Jmol
- DBSI Help : Opens the DBSI Server instruction manual (this document)
The DBSI PDB file has the DBSI results written into the B-factor column, for viewing in molecular viewers such as PyMOL. To view the results in PyMOL, download the dbsi.pdb file and open it with PyMOL. Under the C menu, select spectrum and color by B-factor. This gives a scaled coloring scheme, with red marking predicted DNA binders and blue marking predicted non-binders.
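The scores can also be read from the B-factor column directly, without a molecular viewer. The following Python sketch relies on the fixed-column PDB layout, where the B-factor occupies columns 61-66. The function name is hypothetical, and the sketch assumes that every atom of a residue carries the same DBSI score, so it keeps the first value seen per residue.

```python
def read_bfactor_scores(pdb_text):
    """Map (chain ID, residue number, residue name) to the B-factor value
    of the first ATOM record seen for that residue."""
    scores = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 66:
            continue
        key = (line[21], int(line[22:26]), line[17:20])
        # Assumption: all atoms in a residue share one score,
        # so the first value encountered is kept.
        scores.setdefault(key, float(line[60:66]))
    return scores
```

For example, `sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:10]` lists the ten residues with the highest values. Insertion codes are ignored in this sketch.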
The results text file has several columns. Columns 1 to 3 identify the residue. Column 4 is a validation tag that is not relevant to server-submitted jobs. Column 5 holds the DBSI prediction score (a positive value indicates that the amino acid is a predicted binder). The last column contains a + if the residue is predicted to bind DNA and a - if not, which is useful for identifying stretches of predicted DNA-binding residues.
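As an illustration of how the column layout described above might be used, the Python sketch below parses the results file and groups consecutive predicted binders into stretches. It assumes that the columns are separated by whitespace and that there are exactly six of them; neither detail is confirmed here, so check a real results file first. The function names and the `min_length` parameter are hypothetical, and the first three columns are treated as an opaque residue identifier.

```python
from itertools import groupby


def parse_results(text):
    """Return a list of (identifier, score, is_binder) tuples.
    Lines that do not match the assumed layout are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[-1] not in ("+", "-"):
            continue
        try:
            score = float(fields[4])  # column 5: DBSI prediction score
        except ValueError:
            continue
        rows.append((tuple(fields[:3]), score, fields[-1] == "+"))
    return rows


def binding_stretches(rows, min_length=2):
    """Group rows that are consecutive in the file and marked '+'."""
    stretches = []
    for is_binder, group in groupby(rows, key=lambda row: row[2]):
        members = list(group)
        if is_binder and len(members) >= min_length:
            stretches.append([row[0] for row in members])
    return stretches
```

Here "consecutive" means adjacent lines in the file. If the file lists only surface residues, two adjacent lines may not be neighbors in sequence, so compare residue numbers where that matters.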
If an error occurs while running DBSI, the job queue reports an Error status. Clicking on View Error displays a log file that explains the nature of the error.
Errors we have encountered thus far include: nonstandard residues (especially MSE); chain breaks in which TER records are present in the middle of a chain; and uploading a file that is not a valid .tgz from CHARMM-GUI.
If you have ruled out these possibilities and are unable to understand the reason for your error, please email us with your job number, so that we may look into it and update the instructions if needed.
This website is free and open to all users and there is no login requirement. Users can register prior to submitting jobs to any of the tools hosted by the Mitchell Lab or submit jobs anonymously. Personal information is only used to contact users when their analysis is complete; it will not be shared. To register, enter a unique user name and email address on the registration page, then click the submit button. An error message will display if the selected user name is in use by another user. User registration is not required to run DBSI server jobs.
Once registered, users may log in to the server. Although login is not required to submit jobs, it allows a user to view their personal jobs in the job viewer. Both the username and password are case sensitive. By default, a login expires after two weeks; however, a user may also log off manually.
The DBSI server has been tested on Windows and Mac machines, as well as popular browsers such as Google Chrome, Mozilla Firefox and Safari.
For visualizing DBSI results in the browser, the Mitchell Lab website uses Jmol, which requires either Java or HTML5. Jmol is extensively documented, so we direct users to the following websites for information about its use.
Open the Java Control Panel (accessible from the Settings app on a Mac or the Control Panel on Windows). Under the Security tab, ensure that the Enable Java content in browser check-box is checked and that an exception for https://mitchell-lab.org is added under the Edit site list option.
Some Macs do not support Java 7 on Google Chrome; instructions to revert to the original Apple Java distribution can be found here. Note that the DBSI server will still work if Java is disabled in your browser, but the Jmol visualization will not. Results can still be downloaded and viewed on your own computer. (Instructions here)