DBSI server manual

Contents

  1. Introduction
  2. Submitting a job
  3. Viewing job results
  4. Registration and login
  5. Browser requirements

Introduction

The DBSI model

DNA-Binding Site Identifier (DBSI) is a structure-based method for identifying DNA-binding sites on proteins that are known or believed to bind DNA. DBSI uses carefully designed and selected features that capture physical, chemical and geometric properties of the protein surface that enable DNA recognition. The model uses the three-dimensional structure of a protein to calculate features describing residue characteristics, secondary structure, solvent accessibility, electrostatic potential, atomic density and surface geometry. Special emphasis was placed on capturing local and non-local cooperative effects, through features that take into account the residue micro-environment and features that capture longer-range cooperativity. The model uses Support Vector Machines (SVM) to classify surface residues as DNA-binding or non-binding. DBSI has been trained, validated and tested on several datasets of high-resolution structures of protein-DNA complexes, and is among the most accurate methods for identifying DNA-binding sites. The model was developed by optimizing the feature combination and training parameters using an iterative forward selection approach. In addition, by studying proteins with both bound and unbound structures, we demonstrate that DBSI can predict DNA-binding sites from the unbound structure with accuracy similar to that of predictions made from the bound structure. This is a significant observation, because an unbound structure is the expected starting point in practical applications.

Publications - citing DBSI

The DBSI model has been trained on a dataset of 262 crystal structures of protein-DNA complexes, extensively cross-validated, and tested on independent datasets containing 206, 29 and 30 structures. Various performance metrics were computed for each of these datasets. DBSI has also been compared with several other sequence-based and structure-based methods for identifying DNA-binding sites. Further information regarding features, model optimization, performance metrics and model comparisons can be found here.

The DBSI server is a web-based interface for running DBSI and visualizing its predictions overlaid on the protein structure. The DBSI server uses several third-party software packages and in-house programs to analyze and predict DNA-binding sites on submitted structures. The web server automates these calculations and requires only the tarball obtained from the CHARMM-GUI (Instructions here) to run DBSI on the protein structure of your choice.

The citations given below provide a complete discussion of the development and performance of DBSI.
If you use the DBSI server in your work, please cite these papers:

X. Zhu, S. E. Ericksen, and J. C. Mitchell
DBSI: DNA Binding Site Identifier
Nucleic Acids Research, 41(16): e160, 2013.

S. Sukumar, X. Zhu, and J. C. Mitchell
The DBSI Server: Predicting DNA Binding Sites
TBD, TBD.


Submitting a job

Using the CHARMM-GUI to generate the .tgz file

DBSI is a structural model and requires a 3-D structure of the protein for which DNA-binding site predictions are to be made. Electrostatic features are integral to DBSI's calculations. They are derived from an electrostatic map of the structure, which can be generated by solving the Poisson-Boltzmann equation using the CHARMM-GUI here.

Some caveats to note:

In this example, we analyze only 2 protein chains (chains A and B) that constitute a dimer.



The next prompt allows us to specify some parameters to solve the Poisson-Boltzmann equation. Please note that modifications are necessary to some parameters. As highlighted in the figure, epsP is changed to 2.0, Dcel_c is changed to 1.0 and Dcel_f is changed to 0.5.


Generating the electrostatics files can take some time. Once this process is complete, download the .tgz file by clicking the Download .tgz button. This file serves as the input to the DBSI server.
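
Before uploading, it can be useful to confirm that the downloaded file really is a readable gzip-compressed tar archive. The following is a minimal sketch using only the Python standard library. The function name list_archive_members is hypothetical, and the sketch makes no assumption about which files CHARMM-GUI places inside the archive; it only checks that the archive can be opened and lists its contents.

```python
import tarfile


def list_archive_members(path):
    """Return the member names of a gzip-compressed tar archive.

    Raises ValueError if the file cannot be read as a .tgz archive.
    """
    if not tarfile.is_tarfile(path):
        raise ValueError(f"{path} is not a readable tar archive")
    try:
        # "r:gz" insists on gzip compression, as expected for a .tgz file.
        with tarfile.open(path, "r:gz") as archive:
            return archive.getnames()
    except tarfile.ReadError as error:
        raise ValueError(f"{path} is not a gzip-compressed tar archive") from error
```

For example, calling list_archive_members on the downloaded file (under whatever name your browser saved it) prints nothing by itself but returns a list of file names that can be inspected before submission.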


The Job Submission Form and Job queue

Once CHARMM-GUI has generated the .tgz file, it can be uploaded to the DBSI server to predict DNA-binding sites. Open the Submit Job page, enter your email address (optional) and upload the .tgz file from CHARMM-GUI. A job name can also be specified. By default, job privacy ensures that the results in the job queue are readable only by you. This page also provides an option to download a sample PBEQ archive and a link that automatically loads test data to submit to the DBSI server.
When the upload is successful, a job directory is created on the server and the status of the job can be viewed in the job queue. A link with the job status is provided, where results will be displayed upon job completion.

Viewing job results

Clicking View results takes you to the Jmol visualization of the results where the DBSI scores are overlaid onto the structure. Clicking the JobID takes you to the results summary page (see example).


Viewing job results using Jmol in the browser

The Jmol visualization has a control panel as well as options to change the background, display styles and coloring schemes. Residue-level display options are also implemented. Scores for individual residues and surface patches can be seen as chains of residues. To load an example, click here.

Display Controls

These controls alter the appearance of the selected atoms. By default, DBSI selects all protein atoms in the complex. Advanced users may change the atom selection by using the Jmol scripting language.

Additionally, users can save up to four different views of their session.

Surface and DNA-Binding Residues

Each chain produces a unique group in the interface display. The checkboxes in each cell control the display of an interface residue. The coloring within each cell also encodes information about the residue.

Miscellaneous Buttons

Caveat: If you use the console to make selections and change displays, the selections shown in the Control Panel may no longer be accurate. Actions taken using the console override any mouse-driven selection and display controls.

PDB file with analysis

The DBSI PDB file has the DBSI results loaded into the B-factor column, for viewing in molecular viewers such as PyMOL. To view the results in PyMOL, download the dbsi.pdb file and open it with PyMOL. Under the C dialog box, select spectrum and color by B-factor. This will give you a scaled coloring scheme, with red indicating predicted DNA binders and blue indicating predicted non-binders.
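
The values in the B-factor column can also be read without a molecular viewer. The sketch below relies on the fixed-column layout of PDB ATOM records, in which the B-factor occupies columns 61 to 66. The function name residue_scores_from_pdb is hypothetical, and the sketch assumes that every atom of a residue carries the same value, so it keeps the first value seen for each residue. It returns whatever numbers were written to the file and does not interpret them.

```python
def residue_scores_from_pdb(lines):
    """Map (chain, residue number, residue name) to the B-factor column value.

    Only ATOM records are read. Assumption: all atoms of a residue carry the
    same value, so the first atom encountered for each residue is used.
    """
    scores = {}
    for line in lines:
        if not line.startswith("ATOM"):
            continue
        resname = line[17:20].strip()   # columns 18-20: residue name
        chain = line[21]                # column 22: chain identifier
        resseq = line[22:27].strip()    # columns 23-27: number + insertion code
        bfactor = float(line[60:66])    # columns 61-66: B-factor
        scores.setdefault((chain, resseq, resname), bfactor)
    return scores
```

A typical use would be to open dbsi.pdb as a text file and pass the file handle to residue_scores_from_pdb, then sort the resulting dictionary by value to list the highest-scoring residues.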



Results text file

The results text file has several columns. Columns 1 to 3 identify the residue. Column 4 is a validation tag that is not relevant to server-submitted jobs. Column 5 contains the DBSI prediction score (a positive value indicates that the amino acid is a predicted binder). The last column contains a + if the residue is predicted to bind DNA and a - if not, which is useful for identifying stretches of predicted DNA-binding residues.
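
The column layout described above lends itself to simple scripting. The following sketch assumes that the columns are separated by whitespace, which is not stated here and should be checked against an actual results file. The function names parse_results and binding_stretches are hypothetical. A stretch is defined here as a run of consecutive rows marked +, which corresponds to consecutive residues only if the file lists residues in sequence order.

```python
def parse_results(lines):
    """Parse rows into (residue identifier, score, is_binder) tuples.

    Assumption: whitespace-separated columns, with columns 1-3 identifying
    the residue, column 5 holding the score and the last column holding + or -.
    """
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or incomplete lines
        try:
            score = float(fields[4])
        except ValueError:
            continue  # skip header-like lines without a numeric score
        rows.append((tuple(fields[:3]), score, fields[-1] == "+"))
    return rows


def binding_stretches(rows, min_length=2):
    """Group consecutive rows marked + into stretches of at least min_length."""
    stretches, current = [], []
    for residue, _score, is_binder in rows:
        if is_binder:
            current.append(residue)
            continue
        if len(current) >= min_length:
            stretches.append(current)
        current = []
    if len(current) >= min_length:
        stretches.append(current)
    return stretches
```

The min_length parameter is an illustrative choice; setting it to 1 reports every predicted binder, including isolated ones.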


Error Messages

If an error occurs while running DBSI, the job queue reports a status of Error. Clicking on View Error displays a log file that explains the nature of the error.

Errors we have encountered thus far include: nonstandard residues (especially MSE); chain breaks in which TER records are present in the middle of a chain; and uploading a file that is not a valid .tgz from CHARMM-GUI.
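
The first two of these problems can often be spotted in the input structure before it is processed. The sketch below is an illustration, not the check performed by the server. It scans PDB-format lines and reports residue names outside the 20 standard amino acids (water, HOH, is skipped as an assumption) and ATOM records that reappear for a chain after a TER record has already closed it. The function name precheck_pdb is hypothetical.

```python
STANDARD_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def precheck_pdb(lines):
    """Return a list of messages describing likely problems in a PDB file."""
    problems = []
    seen_residues = set()     # nonstandard residues already reported
    terminated = set()        # chains that a TER record has closed
    reported_breaks = set()   # chains already reported as broken
    last_chain = None
    for line in lines:
        record = line[:6].strip()
        if record == "TER":
            if last_chain is not None:
                terminated.add(last_chain)
        elif record in ("ATOM", "HETATM"):
            resname = line[17:20].strip()
            chain = line[21]
            resseq = line[22:27].strip()
            last_chain = chain
            key = (chain, resseq, resname)
            nonstandard = resname not in STANDARD_RESIDUES and resname != "HOH"
            if nonstandard and key not in seen_residues:
                seen_residues.add(key)
                problems.append(f"nonstandard residue {resname} {chain}{resseq}")
            # HETATM records after TER are normal (ligands), so only ATOM counts.
            if record == "ATOM" and chain in terminated and chain not in reported_breaks:
                reported_breaks.add(chain)
                problems.append(f"TER record inside chain {chain}")
    return problems
```

An empty list means that neither problem was detected; it does not guarantee that the job will run without errors.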

If you have ruled out these possibilities and are unable to understand the reason for your error, please email us with your job number, so that we may look into it and update the instructions if needed.

Registration and login

This website is free and open to all users, and there is no login requirement. Users can register before submitting jobs to any of the tools hosted by the Mitchell Lab, or submit jobs anonymously. Personal information is used only to contact users when their analysis is complete; it will not be shared.

To register, enter a unique user name and email address on the registration page, then click the submit button. An error message is displayed if the selected user name is already in use by another user.

Once registered, users may log in to the server. Although login is not required to submit jobs, it allows users to view their personal jobs in the job viewer. Both the user name and password are case sensitive. By default, a login expires after two weeks; however, a user may also log off manually.

System/Browser requirements

The DBSI server has been tested on Windows and Mac machines with popular browsers such as Google Chrome, Mozilla Firefox and Safari. For visualizing DBSI results in the browser, the Mitchell Lab website uses Jmol, which requires either Java or HTML5. Jmol is extensively documented, so we direct users to the following websites for information about its use.

If your browser does not support HTML5, please ensure that Java is up to date on your system. We suggest navigating our site with a JavaScript-enabled browser. Also, please be sure you have the Sun Java Engine 1.4 or later installed. Visit java.com to download the latest version. In the Java control panel (accessible from the Settings app on a Mac or the Control Panel on Windows), under the Security tab, please ensure that the Enable Java content in browser check-box is checked and that an exception for https://mitchell-lab.org is added under the Edit site list option.



Some Macs do not support Java 7 on Google Chrome; instructions to revert to the original Apple Java distribution can be found here. Note that the DBSI server will still work if Java is disabled in your browser, but the Jmol visualization will not. Results can still be downloaded and viewed on your own computer. (Instructions here)