EMAN2 Reconstruction Tutorial Using the Workflow interface

Getting Started
• Make sure you have the latest available version of EMAN2 installed. The workshop computers will have it preinstalled.
➡ EMAN2 documentation is largely provided via the Wiki at http://blake.bcm.edu. If you wish to edit the Wiki, create an account for yourself, then send email to [email protected] and we will adjust permissions so you can edit. Previously this was an ʻopenʼ wiki, but we had serious spam problems, and now have to individually approve new users. You donʼt need an account to browse the current contents, of course.
➡ GUI Tips: The interface in EMAN2 is quite similar to EMAN1. While there are alternatives for machines like Macs with the native 1-button mouse, EMAN2 will work much better if you have a 3-button/scroll mouse.
• In most display windows (plots, 2-D images and 3-D volume display), the middle mouse button will open a ʻcontrol panelʼ for the widget with many options to control the display.
• The right mouse button is used for panning in 2-D or 3-D image windows. In plot windows it can be used to zoom (by shift+dragging) and to reset the zoom (by clicking).
• The left mouse button has various purposes in various contexts.
• The scroll-wheel will generally act as a zoom. Use the control panel for more precise control.
• If you have a one-button mouse, one of the modifier keys (depending on platform) combined with a mouse click will serve the same role as a middle-click. You may need to try them (alt, command, ctrl, shift) to discover which works on your machine.
• In the control panels, and other places in the EMAN2 interface, you may encounter ʻValSlidersʼ. These are widgets where a slider is attached to a text-box with a number in it. Dragging the slider controls the number, and entering a number will change the slider. In addition, the text-box can be used to control the range of the slider and get more precise control. By typing ʻvalueʼ in the text box you can change the limits of the slider. Note that it is also possible to enter values which are outside the current slider range.
Introduction
EMAN2 can be used at many different levels, ranging from writing code in C++ or Python to a high-level integrated 3-D desktop interface. In this tutorial, we will be focusing primarily on the Workflow interface, which will guide you through the single particle reconstruction process, and keep track of appropriate intermediate data generated during the process in a self-consistent way. The workflow can be used as part of the integrated desktop (e2desktop.py) or as a standalone program (e2workflow.py). Unfortunately the desktop isnʼt quite perfected yet, and problems seem to be very machine dependent. On some machines it performs extremely well, but on other machines it behaves strangely. The advantage of the desktop is that it will save you a fair bit of desktop space, which is particularly nice on laptops. On the downside, it is fairly 3-D graphics intensive. So, for purposes of the workshop we will just use the workflow without the desktop.
• Open e2workflow.py
  • On Windows
    • double click the appropriate icon on your desktop
EMAN2 Tutorial
    • in the ʻEMAN2 Tasksʼ window, go to the options under ʻUtilitiesʼ and choose the ʻWorking Directoryʼ task
    • browse to the eman2_demo/raw_data directory and hit OK
  • On Mac and Linux
    • open a terminal window
    • cd to the eman2_demo/raw_data directory
    • type the name of the program (e2workflow.py)
➡ If you did this step successfully you will have two windows: one titled ʻRunning Tasksʼ and the other titled ʻEMAN2 Tasksʼ. If you hover your mouse over ʻDirectoryʼ, you should see a tooltip showing that you are in the eman2_demo/raw_data directory. The ʻRunning Tasksʼ window might be hidden behind the workflow window.
Setup Project and Import Data
The workflow interface is an expandable tree. Each level of the tree has a form with associated information or parameters, even the levels which just appear to be containers for the levels below them. The first item in the workflow is Single Particle Reconstruction. This item can be expanded into a list of the individual stages of the single particle reconstruction process, but it can also be selected itself to provide information about the overall reconstruction you are performing in the local directory. Note that the database used to store the information about your project resides in the local directory, so each project must reside in its own directory. The workflow will make a number of named subdirectories (folders) to store specific types of information.
• Select Single Particle Reconstruction
  • This will open a window where you enter basic information about your project. The demo data for the workshop has the following parameters: 2.0 A/pix, 200 kV, Cs=1.0, mass=800kDa.
  • For a laptop you may want to use only 1 CPU, even if you have a dual core machine, to reduce heating, but it will make things run slower.
  • The workflow will attempt to detect how much RAM you have, but this may or may not work on all platforms. This parameter isnʼt really used at the moment, but you may as well enter the correct value, as it will be used in future.
  • Click ʻOKʼ once you have entered the parameters. If you select a different workflow item without clicking ʻOKʼ, your values will not be stored.
• Expand Single Particle Reconstruction, causing 9 subtasks to appear: Raw Data, Particles, CTF, Particle Sets, Reference Free Class Averages, Initial Model, 3D Refinement, Resolution, and Eulers.
➡ In most steps you have a choice between using data already imported into the project, or importing the data directly. For example, if you wish to use particles already boxed out using another (non-EMAN) application, you can skip forward to the Particles step. However, the earlier in the process you import your data, the more flexibility you will have in recording relevant metadata about the processing youʼre doing. For example, if you import your CCD frames or micrographs, when particles are selected, EMAN2 will keep track of where each particle came from in each micrograph, offering many possibilities for future analyses that could be performed. This flexibility would be lost if you skip forward in the process.
• Select the Raw Data task. You will see a form appear with an empty table. This form is a status display showing you all of the frames which have already been imported. You can add raw data to this form by hitting the ʻBrowse To Addʼ button or by right clicking and choosing ʻAddʼ. Note that the right-click menu also has ʻRemoveʼ and ʻSave Asʼ options.
• Under the Raw Data task you will see Filter Raw Data. If you open this, a similar form to the one you just closed will be opened. Click on the ʻBrowse To Addʼ button. You should see the sample micrographs in the raw_data directory. Select them all by first clicking on image 1160.mrc, holding shift and then clicking on 1792.mrc (or just click and drag over your selections). This should highlight all images. Then hit OK. The browser dialog should close and the selected images should appear in the table on the first form.
• In the Filter Raw Data form make sure that the ʻEdge normʼ and ʻThumbnailsʼ check boxes are selected, also select the ʻAssociate with projectʼ option to have the filtered files automatically added to the project, and then hit OK.
  ➡ The Edge norm option will adjust the images so the mean value around the edge of the frame is zero and the standard deviation of the pixel values is 1. This normalization helps regularize the data for later processing and minimizes problems with brightness and contrast, though additional per-particle normalization may occur later.
  ➡ Generating thumbnails will save time later when e2boxer is used for boxing.
  ➡ The invert option should be used if necessary to make your particles white on a dark background. Depending on acquisition method and whether your data is in ice or stain, you may or may not need this. For the workshop demo data, you do not need to invert.
  ➡ The X-ray pixel option will apply a filter to remove x-ray pixels from CCD frames. The demo data for the workshop is scanned film data, so this is not necessary.
  ➡ The ʻAssociate with projectʼ option adds the filtered images to the list of frames in the project. You can see the list of frames in the project by going back to the Raw Data task.
  ➡ The ʻInplace processingʼ option writes the filtered images over the input images, which saves on disk space, but you will lose your original unprocessed images.
• Check that the micrographs have been correctly imported into the project using the Raw Data task. This task will now display the images you just filtered, so long as you selected ʻAssociate with projectʼ. If you imported more images later, they would be added to this list. Double-clicking on one of the images in the list will display the image on the screen.
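The Edge norm description above (edge mean of zero, pixel standard deviation of 1) can be illustrated with a small sketch. This is only an illustration of the stated arithmetic in plain Python, not EMAN2ʼs own implementation; the function name edge_normalize and the tiny test frame are assumptions made for the example.

```python
from statistics import fmean, pstdev

def edge_normalize(image):
    """Shift and scale a 2-D image (a list of rows) so that the mean of its
    one-pixel border is 0 and the standard deviation of all pixels is 1.
    Hypothetical helper for illustration only, not an EMAN2 function."""
    rows, cols = len(image), len(image[0])
    # Collect the one-pixel-wide border of the frame.
    edge = [image[r][c]
            for r in range(rows) for c in range(cols)
            if r in (0, rows - 1) or c in (0, cols - 1)]
    edge_mean = fmean(edge)
    # The standard deviation does not change when a constant is subtracted,
    # so it can be measured on the original pixel values.
    sigma = pstdev([p for row in image for p in row])
    if sigma == 0:
        raise ValueError("a constant image cannot be normalized")
    return [[(p - edge_mean) / sigma for p in row] for row in image]

# A made-up 4x4 frame: flat background of 10 with a brighter centre.
frame = [[10, 10, 10, 10],
         [10, 30, 50, 10],
         [10, 50, 30, 10],
         [10, 10, 10, 10]]
normalized = edge_normalize(frame)
```

After this step the background sits at zero regardless of the original brightness of the micrograph, which is why frames recorded under different conditions become easier to compare.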
Particle data/boxing your data
• Select the Particles task. You will see a form appear with an empty data list. This form is a status display showing you all of the particle stacks currently associated with the project. You can add particle data to this form by hitting the ʻBrowse To Addʼ button or by right clicking and choosing ʻAddʼ. Note that the right-click menu also has ʻRemoveʼ and ʻSave Asʼ options. If you already have your particle data you can add it in this way and proceed directly to the CTF stage.
➡ Note that whenever possible we have designed the forms to be flexible enough for you to add and remove data ʻon the flyʼ. For example, you could bypass the earlier Raw Data form and go straight to Interactive Boxing and ʻAddʼ the raw data there.
• Expand the Particles entry in the EMAN2 tasks window. You should see: Interactive Boxing, Auto Boxing, and Generate Output.
• Go to the Interactive Boxing – e2boxer task and hit OK.
• You should see the e2boxer interface form appear, which displays a table. The left-most column lists the image names of the micrographs in the project. The other column, most probably blank at this point, is titled Stored Boxes.
➡ Note that you can double click on any of the image names in the left column and the associated image will appear in an interface for viewing (try it). As a general rule of thumb, anything that has an icon is viewable by double-clicking.
• If you have a decent workstation with a reasonable amount of RAM, you could open all of the micrographs at once, but on a laptop this probably isnʼt wise. Choose 2 or 3 images (or just 1 if you have problems with more, implying your machine is too low-end), enter a boxsize of 128, then hit OK.
• Wait a moment while e2boxer loads.
e2boxer – the GUI
• You should see at least 3 windows appear. If you are using e2desktop.py, the windows will be positioned for you in specific locations. If you are running e2workflow.py you will need to arrange the windows yourself. On a laptop with a low-resolution display, this can be a challenge. The windows are:
  • the main controller - looks like a regular window; it has an assortment of buttons and text entry boxes
  • the main image display window - shows the full 2D micrograph currently selected
  • the particle display window - initially empty; it will eventually show the boxed particles (this should be obvious from its icon)
  • micrograph thumbnails - this will only appear if you selected 2 or more micrographs in the previous step. Clicking on one of these will select the current image to be boxed.
➡ Note that you can get help in most of EMAN2ʼs interfaces by hitting F1. This will cause your web browser to open an EMAN2 wiki page displaying relevant information. Try hitting F1 in the 2D image display and Particle Stack display interfaces.
Getting started with interactive autoboxing
In the main controller you should see a check box labeled ʻDynapixʼ; we recommend that you turn this on. If this option is selected, e2boxer will autobox your images in real time as you interactively specify reference boxes (which are black). Turn Dynapix on, then go to the main image display, find a particle and click on it. Try to center the particle in the box fairly well. You can move it after clicking by dragging it in either window. In the particle window you should see your newly selected box appear and, if e2boxer was able to do so successfully, additional autoboxed particles should be displayed. Continue adding black reference boxes in the main image display until a reasonable proportion of the particles in the image have been selected. Try to avoid having too many false positives, though.
➡ If you put a black box on something that caused a lot of bad particles to be selected, hold shift down and left click on it to remove it. The automatic boxes will update to reflect this change (provided Dynapix is on).
➡ Bad particles or boxes that contain just noise can be damaging to your reconstruction, as they permit noise/model bias to become stronger. It is much better to miss a few good particles if it permits you to exclude obvious ʻbadʼ particles.
➡ One caveat to the above: if the good particles that get excluded are all in one orientation, that would be a bad thing, of course.
➡ If you just want to start again: hit the clear button in the main controller. All reference (black) and auto (green) boxes will be removed and you can start again on this image. This functionality is particularly useful when you are just getting to know how e2boxer works. Clearing can also be useful when you think the results of the autoboxing are bad. Try starting afresh but use different references.
➡ The ʻclassifyʼ button does not currently do anything useful. This is a planned future feature which will make it much easier to eliminate bad particles, but it isnʼt complete yet. There will likely be other automatic picking algorithms in future as well.
➡ Note that this isnʼt your only opportunity to eliminate bad particles. Try not to be overly liberal, but you will have another chance to clean up any false positives after the Wiener filtration step.

Building up a boxed particle set with e2boxer
• Now that you are familiar with how the Dynapix option behaves, the idea is to select a set of references that results in the autoboxing of as many good particles as possible. Once you are satisfied the autoboxing results are as good as you are going to make them, you go through the particles in either window and manually delete bad particles or manually add missed good particles (without changing the autoboxing).
➡ To exclude large regions of the micrograph: Choose the erase option in the main controller and drag over the bad areas of your micrograph (e.g. ice contamination and particle aggregates). Unerasing will bring back particles that you accidentally removed, as long as you have not autoboxed after you erased. You can change the radius of the erased region in the main controller or by holding down shift and moving the mouse wheel.
➡ To delete bad particles individually: Hold down shift and left click on the bad particles, either in the main image display OR in the particle image display. In the main image display you can also hold down shift, left click and drag the cursor around to delete many particles at one time.
➡ To add manual particles: Choose the manual button in the main controller and then click on the center of the particle you want to add manually. Manual boxes appear white and have the advantage of not changing the autoboxing results.
• Once you have finished boxing the first image you can freeze it. This prevents e2boxer from autoboxing the image again in the future should you ever reopen or reselect the image in the e2boxer interface.
Boxing multiple images interactively
➡ Note that this section applies only if you have more than one image loaded into the e2boxer interface.
• Choose the next image in the image thumbnails window. This should load the new image into the main display window, and if Dynapix is on you should also observe that autoboxing has occurred.
• At this point you have several options:
  • you can add more reference boxes and see if the autoboxing results can be improved
  • you can be satisfied with the results and go straight to the cleaning stage
  • you can hit Clear in the main controller and start the dynamic autoboxing procedure from scratch for this (and subsequent) images
➡ If Dynapix is on, you have added a reference box, and you notice that the autoboxing is running slowly, it can be because you have specified reference boxes in multiple images. Try pressing clear and starting afresh in the current image. This should only be a problem on lower end machines.
➡ There are also more advanced options which give you finer control over the autoboxing process, but these should not be necessary for the demo. Just be aware that there are many more parameters if you arenʼt getting satisfactory results with the defaults.
Finishing e2boxer
• Once you are done boxing the images you may be tempted to press the ʻgenerate outputʼ button, but it is more efficient to do this in the workflow interface instead once you have finished boxing all of the micrographs. When you have finished boxing each set of micrographs, simply click Done in the main controller and return to the e2workflow interface. We will deal with ʻgenerate outputʼ later.
• If you have run e2boxer correctly you should be able to click on the Interactive Boxing task and observe that the Stored Boxes column now has entries in it. These entries should correspond to the total number of boxes currently stored in the database for the given image.
• At this point you have two options. To get familiar with the workflow you may try both approaches.
1. You can run automated boxing on the remaining micrographs. Go to the Auto Boxing (e2boxer) task, choose images with no boxes stored in the database (column 2 should be blank) and then hit OK. This will tell the workflow to spawn processes on your operating system that complete the autoboxing procedure. You can monitor the progress of these processes in the Running Tasks window, which should display the percentage of the task which is complete, the name of the program, and the time at which it was initiated. If the task doesnʼt appear in the Running Tasks window immediately, just wait a few moments.
➡ On lower end machines try autoboxing only one image at a time. (Windows users may experience some problems here; please let us know if you run into issues.) If the process in the Running Tasks list stops running (the percentage entry freezes at some value less than 100%) then you should definitely try just doing one image at a time.
➡ Even if you plan to manually clean up the autoboxing results, performing autoboxing in this way can save you the time you would normally spend waiting in the e2boxer interface for autoboxing to occur.
2. Repeat the process using the e2boxer GUI for the other micrographs.
➡ For the purposes of the workshop, you should have boxed out all of the micrographs in the project before proceeding to the next step.
• Generate boxed particle output by going to the Generate Output (e2boxer) task in the Particles section of the workflow. You should see a form appear that lists the micrographs you have in the project as well as the number of boxes you have stored for each image in the database.
  • Select all of the images
  • enter a boxsize of 144
  • make sure the normalize.edgemean option is chosen
  • make sure the output image format is “bdb”
  • Hit OK. Once again this will spawn processes on your operating system that you can monitor in the Running Tasks window.
➡ On lower end machines, especially Windows machines, you may experience problems and be forced to run the output generating tasks one at a time.
• Check that your particle images exist and are stored in the database by going to the Particles task. This will display a table that tells you precisely which particles exist, how many there are and what dimensions they have.
• If this table is complete and the dimensions of the particles are correct (144x144) you are ready to go to the next stage in the workflow.
➡ Assuming you chose the ʻbdbʼ format for your boxed particle output, youʼll note that the filenames that appear in this list arenʼt simply “1160_ptcls”, but the much longer and more complicated looking ʻbdb:particles#1160_ptclsʼ. The files embodying this database are stored in the ʻparticles/EMAN2DBʼ directory. The specifications for bdb database access are fairly straightforward:
• ʻbdb:dbnameʼ refers to the database (think of it as an image file) in the local directory named ʻdbnameʼ. ʻdbnameʼ can contain individual images, numbered sets of images, named images and other named metadata.
• ʻbdb:/path/to#dbnameʼ allows you to specify a database residing in a different directory. Note that ʻ#ʼ is used for the final separation between the path and the database name.
• Files in the EMAN2DB directory should NEVER be renamed, individually deleted, moved, etc. It can cause the system to become ʻconfusedʼ and produce a range of seemingly unrelated errors.
  • If you really insist on deleting files in the BDB directory, make sure you arenʼt running any EMAN2 programs, remove the files, then also remove /tmp/eman2db-username. Note that the easiest way to remove files from an EMAN2 database is using the EMAN2 browser.
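To make the two specifier forms concrete, here is a minimal sketch that splits a bdb string into its directory and database name, following only the rules stated above. The helper names parse_bdb_spec and storage_directory are hypothetical and written for this illustration; they are not part of EMAN2, which handles these strings internally.

```python
import posixpath

def parse_bdb_spec(spec):
    """Split a 'bdb:' specifier into (directory, database name).
    Hypothetical helper for illustration; not an EMAN2 function."""
    prefix = "bdb:"
    if not spec.startswith(prefix):
        raise ValueError(f"not a bdb specifier: {spec!r}")
    body = spec[len(prefix):]
    # '#' is the final separator between the path and the database name.
    if "#" in body:
        directory, name = body.rsplit("#", 1)
    else:
        directory, name = ".", body   # no path given: the local directory
    if not name:
        raise ValueError(f"missing database name in {spec!r}")
    return directory, name

def storage_directory(spec):
    """Directory that holds the files embodying the database."""
    directory, _ = parse_bdb_spec(spec)
    return posixpath.join(directory, "EMAN2DB")

print(parse_bdb_spec("bdb:particles#1160_ptcls"))
print(storage_directory("bdb:particles#1160_ptcls"))
```

The sketch only reads the string. It never touches the files themselves, in keeping with the warning above about not modifying the contents of EMAN2DB by hand.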
CTF
EMAN2 uses a substantially different CTF model than EMAN1. EMAN2 uses a data-based background curve. For this process to work properly, and to have phase-flipping work optimally, it is important that the box size used for the particles is somewhat larger than the actual particles. This is the reason for expanding the 128x128 boxes used for particle picking to the larger 144x144 box size. The edges of the images are used in determining the background noise spectrum. After determining the background, automatic fitting of defocus and B-factor takes place. The uncertainty in B-factor tends to be fairly substantial, largely because it is only a very approximate representation of the true envelope function of the data coming out of modern FEG microscopes.
➡ EMAN1 used a complicated 10-parameter CTF model, and required a lot of painstaking manual fitting, with several complicated, difficult-to-describe tasks related to determining a structure factor. Almost all of this work has been completely eliminated in EMAN2, but the models are not compatible. While EMAN2 can understand EMAN1 CTF parameters as well, if you want to work with EMAN2, you are much better off going back to non-phase-flipped particles and letting EMAN2 deal with the CTF entirely internally.
• Select the CTF task. This will display the list of particles in the project along with any determined CTF parameters (defocus, bfactor, etc) and/or any CTF related output (phase flipped or Wiener filtered data). As usual, you can ʻBrowse To Addʼ particle stacks to this form, and remove unwanted data (right click menu). Note that the table displayed by the CTF task is the same as the one displayed by the Particles task, except that extra columns have been added to display CTF related information. The CTF-related columns are filled in as you determine CTF parameters and write CTF output.
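For readers who want a feel for what defocus, voltage and spherical aberration do to the data, the sketch below evaluates a common textbook form of the CTF using the workshop parameters (200 kV, Cs=1.0, 2.0 A/pix). It is an illustration under stated assumptions, not EMAN2ʼs model: the sign convention, the 10% amplitude contrast, the assumption that Cs is in millimetres, the 2 µm defocus used in the test loop, and all function names are choices made for this example, and EMAN2ʼs data-based background and B-factor envelope are left out.

```python
import math

def electron_wavelength(voltage_kv):
    """Relativistic electron wavelength in Angstroms for a voltage in kV."""
    volts = voltage_kv * 1e3
    return 12.2643 / math.sqrt(volts + 0.97845e-6 * volts ** 2)

def nyquist_frequency(apix):
    """Highest spatial frequency (1/Angstrom) sampled at apix Angstroms/pixel."""
    return 1.0 / (2.0 * apix)

def ctf_value(s, defocus_um, voltage_kv=200.0, cs_mm=1.0, amp_contrast=0.1):
    """Textbook CTF at spatial frequency s (1/Angstrom), underfocus positive.
    amp_contrast=0.1 is an assumed value, not taken from the tutorial."""
    lam = electron_wavelength(voltage_kv)
    defocus = defocus_um * 1e4   # micrometres -> Angstroms
    cs = cs_mm * 1e7             # millimetres -> Angstroms
    gamma = (math.pi * lam * defocus * s ** 2
             - 0.5 * math.pi * cs * lam ** 3 * s ** 4)
    return -(math.sqrt(1.0 - amp_contrast ** 2) * math.sin(gamma)
             + amp_contrast * math.cos(gamma))

def phase_flip_sign(s, defocus_um, **kwargs):
    """Factor (+1 or -1) that makes the corrected CTF non-negative at s."""
    return -1.0 if ctf_value(s, defocus_um, **kwargs) < 0 else 1.0

# Sample the curve from zero to Nyquist for the workshop pixel size.
s_max = nyquist_frequency(2.0)
curve = [ctf_value(i * s_max / 50, defocus_um=2.0) for i in range(51)]
```

Phase flipping in this picture simply multiplies each spatial frequency by phase_flip_sign, so the oscillating curve becomes its absolute value. Wiener filtering additionally weights each frequency by its estimated signal-to-noise ratio, which this sketch does not attempt.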
Determining CTF parameters and generating CTF/Wiener filtered particles
• Expand the CTF entry in the workflow (the CTF under Single Particle Reconstruction, not the standalone CTF entry near the bottom of the workflow). You should see three entries appear underneath the CTF entry: Automated Fitting, Interactive Tuning, and Generate Output.
• Run automated CTF determination on your particle data by choosing the Automated Fitting (e2ctf) task.
  • A form should appear with a table that lists the particles that you generated in the previous stage of the workflow. Other columns (which are probably blank at this point) are titled Particles On Disk, Particle Dims, Defocus, B factor, SNR and Sampling.
  • You must decide whether to check the ʻAuto high passʼ checkbox. If selected, this will modify the SNR curve to eliminate the first sharp peak that appears in virtually all single particle data. This sharp peak can cause some issues leading to incorrect 2-D alignment, and hence classification. However, it is also responsible for some of the contrast which makes particles visible by eye. To see the visual difference, you can run it first with and then without this option checked if you like. For purposes of the rest of this tutorial, please check this box.
  • You should also decide on an oversampling factor. For purposes of the demo, 2 is a good value. If you have very far from focus images or small particles, larger values may produce more accurate CTF fitting.
  • To proceed, select all of the images, make sure the microscope voltage, spherical aberration and the angstrom per pixel parameters are correct, and hit OK. This will cause the workflow to spawn a set of jobs to complete automatic CTF determination. You can monitor the progress of the task(s) in the Running Tasks window.
• When the fitting tasks are complete, click on the CTF task in the workflow. This should display a form that has been updated with automatically determined defocus, bfactor, ... parameters.
➡ Proceed to the next step when you are certain that automatic CTF parameters have been generated for all particle data in the project. If necessary, you can rerun automatic fitting on one or more sets.
• Select the Interactive Tuning (e2ctf) task. This will display a (by now) familiar looking table that lists particle file names and CTF data. Select all or a subset of the images and hit OK. This will launch the e2ctf interactive interface from which you can fine tune the automated fitting results. If you do alter the CTF parameters be sure to hit the Save parms button, which will store the changed parameters in the EMAN2 database. The interactive interface also gives you a variety of tools for assessing the quality of the individual images.
➡ Note that unlike in EMAN1, accurate per-micrograph B-factors are not critical for a good reconstruction or proper CTF correction (though it never hurts).
• Select the Generate Output (e2ctf) task and in the form that appears select all of the images, make sure you select both Phase flip and Wiener, use the same oversampling factor you used in the previous step, and hit OK. This will spawn output writing processes that you can monitor in the Running Tasks window.
• Finally check that your phase flipped and Wiener filtered data exist and are correctly stored in the EMAN2 database. Do this by choosing the CTF task and carefully inspecting the displayed table. The number of regular, phase flipped and Wiener filtered particles should match for all image entries in the table, as should all particle image dimensions. You may also use the browser to look in the ʻparticlesʼ directory and examine the raw particles, the phase-flipped particles and the Wiener filtered particles.
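The final check above (matching particle counts and dimensions across the regular, phase flipped and Wiener filtered stacks) amounts to a simple comparison per table row. The sketch below shows that logic on a hand-written table; the dictionary layout, the function name find_inconsistent_entries, the stack name 1792_ptcls and the particle counts are all invented for the illustration, and in practice you read these values off the CTF task form.

```python
def find_inconsistent_entries(table):
    """Return the names of entries whose three particle stacks disagree.

    table maps an image name to a dict with the keys "particles",
    "phase_flipped" and "wiener"; each value is (count, (nx, ny)),
    or None when that output has not been written yet.
    Hypothetical layout for illustration only.
    """
    bad = []
    for name, row in table.items():
        stacks = [row.get(key) for key in ("particles", "phase_flipped", "wiener")]
        # Every stack must exist and share the same count and dimensions.
        if None in stacks or len(set(stacks)) != 1:
            bad.append(name)
    return bad

# Made-up numbers: the second entry is missing its Wiener filtered output.
table = {
    "1160_ptcls": {"particles": (52, (144, 144)),
                   "phase_flipped": (52, (144, 144)),
                   "wiener": (52, (144, 144))},
    "1792_ptcls": {"particles": (47, (144, 144)),
                   "phase_flipped": (47, (144, 144)),
                   "wiener": None},
}
needs_rerun = find_inconsistent_entries(table)
```

Any entry reported this way is a candidate for rerunning Generate Output on just that image.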
Getting rid of bad particles and selecting data to use
• Expand the Particle Sets entry. You will see “Examine Particles” and “Make Particle Set”. Select “Examine Particles”.
• This will prompt you to select what type of particles to look at for this interactive step. Select Wiener filtered particles.
• You will now see a window showing statistics for all of the images you have processed so far. Double-click on one of these images, and a browser window will open.
• Look through the Wiener filtered particles. It should now be much easier to see bad particles as well as particles with other particles too close to them in the box. Left click on each bad particle that you see. You will see a blue mark appear on the particle, and if you look back in the “Examine Particles” window, you will see the bad particle count has increased by 1.
➡ This manual particle selection process is optional. If you skip it you will simply include all of the particles in each set.
• Once you have marked as many bad particles as you like, move on to the “Make Particle Set” phase. Again, select Wiener filtered particles to view.
• In this window, rather than double-clicking on images, select all of the frames for which you wish to process the data in the next stage. This can be all or only a fraction of your data. Clearly the less data you use the faster the processing will be, but the poorer the results will be.
• Generating a named output stack from this step will create a ʻvirtual stackʼ file for each of the types of files you have already prepared (original, phase flipped and Wiener filtered). The nice thing is that these stacks take very little disk space as they do NOT copy the image data again.
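The idea behind a virtual stack can be sketched as a list of references back into the original particle stacks, skipping anything marked bad. This is a conceptual illustration only: the ParticleRef class, the make_virtual_stack function, the stack name 1792_ptcls and the counts are assumptions for the example and say nothing about how EMAN2 actually stores its virtual stacks.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ParticleRef:
    """A pointer to one particle; no image data is stored here."""
    source: str   # stack holding the particle, e.g. "bdb:particles#1160_ptcls"
    index: int    # position of the particle within that stack

def make_virtual_stack(particle_counts, bad_particles):
    """Return references to every particle that is not marked bad.

    particle_counts: {stack name: number of particles in the stack}
    bad_particles:   {stack name: set of indices marked bad}
    """
    refs = []
    for source, count in particle_counts.items():
        bad = bad_particles.get(source, set())
        refs.extend(ParticleRef(source, i) for i in range(count) if i not in bad)
    return refs

# Made-up example: two small stacks, with two particles marked bad in the first.
counts = {"bdb:particles#1160_ptcls": 5, "bdb:particles#1792_ptcls": 4}
bad = {"bdb:particles#1160_ptcls": {1, 3}}
stack = make_virtual_stack(counts, bad)
print(len(stack))
```

Because each entry is only a name and an index, the set costs almost nothing to store, which matches the observation above that these stacks take very little disk space.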
Generating reference free class averages (2D refinement)
• Expand the Reference Free Class Averages entry in the EMAN2 tasks window in e2workflow. There is a single entry: Generate Classes (e2refine2d).
➡ Note that if you click the Reference Free Class Averages task you will see a form that lists the reference free class averages currently associated with the project. You can add your own reference free class averages to this form using the usual mechanisms (Browse To Add, etc.). This can be useful if you want to try generating an initial model using class averages generated outside the workflow.
• Select the Generate Classes (e2refine2d) task and hit OK. This will pop up a small form asking you to choose from your regular, phase flipped and Wiener filtered particles (you can also specify files, but ignore this option for the purposes of the workshop). Choose either the Wiener filtered or phase flipped data and hit OK. This will pop up a form with three tabs asking you to specify the parameters required to run e2refine2d.py. The default parameters will produce reasonable results. To make it run faster, you may wish to select only a subset of your data, but you may use as much as you like. Select the data and hit OK. This will spawn a process that you can monitor in the Running Tasks window.
➡ Mostly you want to be aware of the number of classes generated and how this compares to the total number of particles. Use the particle table to see how many particles are in each particle stack. As a rule of thumb, there should be at least 10-20 particles per class, but larger numbers are also fine.
➡ You can leave most of the options in the Simmx and Class Averaging tabs as they are. However, you do want to think about the shrink option in the Simmx tab. For the reference free class averages you can shrink the data to save time. The time savings can be quite substantial with larger shrink values, but the quality degrades as well.
➡ Tool tips display useful information regarding the specific parameters. Just hold your mouse still over one of the parameters for a few seconds to see the tooltip. If you want more information you can also go to the command line and type e2refine2d.py -h.
➡ If you want more information on the particular parameters you can pass to the aligners and/or comparators (in the Simmx and Class Averaging tabs), go to the command line and type 'e2help.py aligners -v' or 'e2help.py cmps -v'. Omitting the '-v' will produce a less detailed listing.
➡ You can monitor the progress of the refinement in the Running Tasks window, or by running 'e2history.py' in the appropriate directory.
• Once e2refine2d has finished you will want to view the reference free class averages that were generated. You can do this in a number of ways. The first approach is to choose the Browse option in the EMAN2 tasks window. When the browser pops up you should see a folder called r2d_00. Note that if you have run e2refine2d several times you will most likely see several folders that start with 'r2d_' and end with a two-digit number (such as r2d_01, r2d_02). Navigate into the most recent r2d entry (most likely r2d_00 at this stage). The reference free class averages are called classes_init, classes_01, classes_02, etc. The highest numbered classes file contains the final results.
➡ Another way to view the results of e2refine2d in the workflow is to select the Reference Free Class Averages task in the EMAN2 tasks window. There is one entry in this form for every time you have executed e2refine2d. Each entry lists the most recently generated class averages (viewable by double clicking on the entry). Note that this form can also be used to monitor e2refine2d processes as they are running, since it lets you view the most recently generated class averages during the run.
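The rule of thumb on particles per class and the trade-off behind the shrink option can both be checked with a little arithmetic. The following is a minimal Python sketch and is not part of EMAN2: the function names, the example particle count (4000) and the sampling value (2.1 Angstroms per pixel) are invented for illustration, and it assumes that shrinking simply divides the box size by an integer factor.

```python
def suggest_class_count(n_particles, min_per_class=10, max_per_class=20):
    """Return (fewest, most) classes that keep 10-20 particles per class.

    Fewer classes than 'fewest' is also fine (more particles per class);
    more classes than 'most' would drop below 10 particles per class.
    """
    if n_particles < min_per_class:
        raise ValueError("too few particles for even one class")
    fewest = max(1, n_particles // max_per_class)
    most = max(1, n_particles // min_per_class)
    return fewest, most


def shrunk_sampling(box_size, apix, shrink):
    """Box size, Angstroms/pixel and Nyquist limit after integer shrinking.

    Assumption: shrink divides the box size and multiplies the pixel
    size by the same integer factor.
    """
    if shrink < 1:
        raise ValueError("shrink must be a positive integer")
    new_box = box_size // shrink
    new_apix = apix * shrink
    nyquist = 2.0 * new_apix  # finest detail (in Angstroms) the shrunk data can hold
    return new_box, new_apix, nyquist


if __name__ == "__main__":
    # Hypothetical data set: 4000 particles, 144 pixel box, 2.1 A/pixel.
    print(suggest_class_count(4000))
    print(shrunk_sampling(144, 2.1, 2))
```

With shrink set to 2 each image holds a quarter as many pixels, which is where the time savings come from, but the finest detail that can contribute to the alignment is twice as coarse.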
Making an initial model
There is a lot of controversy in the cryoEM community on this point. Some people feel that initial model generation is the most critical step in refinement, and that you need to use difficult and time-consuming experimental methods such as random conical tilt or ±45 degree tilt methods to get an initial model before you can proceed. We disagree with this philosophy. In the vast majority of cases, the simple approach used in EMAN can give a very reliable initial model with no additional experiments required. However, there are a few caveats here:
• Handedness cannot be determined from single particle data without tilting. If determining handedness directly from your data is important, you will have to do some sort of tilt experiment. Personally I find tomography to be the most appealing approach, since it is becoming very standard in most labs nowadays, and issues with direction of rotation, etc. have likely already been dealt with. You simply need to collect a tomogram with sufficient particles that the handedness can be observed. EMAN2 is beginning to incorporate software for 'single particle tomography' as well.
• Heterogeneity is an issue. If you have a particle that is highly heterogeneous, the EMAN initial model strategy is likely to fail to produce a unique answer (since there isn't one). Again, single particle tomography may offer the best route to understanding the heterogeneity in your specimen. EMAN1 has a 'multirefine' procedure for refining data with many types of heterogeneity, but this has not yet been reimplemented in EMAN2 (it will be...).
• Poor angular distribution. If your particles have a strongly preferred orientation, especially if this is combined with a low symmetry, there may not be enough information to produce an unambiguous starting model. It is also important to note that in this situation, even if you get a good starting model, refinement will tend to degrade rather than improve the model. To perform a proper 3-D reconstruction, you must have a reasonable number of particles in orientations either spanning the equator of the unit sphere, or along a line connecting the pole to the equator (corresponding to a complete tomographic series). We will discuss this point more in the workshop.
GroEL is actually a fairly difficult case for the EMAN approach, as it tends to be found predominantly in the side view orientation, with only a few end-on views present, and very few particles in between. This leads to a substantial number of potential bad starting models. However, as you will see, identifying bad starting models can be quite straightforward, and in most cases the correct starting model will be obvious, even without prior knowledge of the shape of your particle.
• Expand the Initial Model entry in the EMAN2 tasks window in e2workflow. You should see one entry appear beneath Initial Models called Make Model (e2initialmodel).
• You can add any models to the Initial Models table using the 'Browse To Add' button, etc.
• Choose Make Model (e2initialmodel) and hit OK. This will pop up a form displaying the available reference free class averages. This form will also require you to input some parameters such as the number of iterations and the number of models to try (to generate). Enter d7 for the symmetry, select the class averages you wish to use for generating the initial model and hit OK. This will spawn a process (e2initialmodel) that you can monitor in the Running Tasks window.
• Once the e2initialmodel process has completed, go to the Initial Models task in the EMAN2 tasks window. This will show you a table listing the initial models that were produced by e2initialmodel. You can double click on entries to view the results. Before going to the next stage you should decide on your best initial model, since it will be used to seed 3D refinement in step 6. In theory, the initial models should be sorted in order of quality. In most cases, you will see that the best scoring model looks very much like GroEL. However, due to the small number of 'top' views of GroEL, there is one false positive (obviously wrong) which can sometimes score better than the correct solution. By browsing through the results you can see some of the other failure mechanisms of this method. You can also use the browser to go into the initial_models directory and see some other files you can use to evaluate the models.
➡ You can use the Initial Models table to view models that are generated while the e2initialmodel process is running. Try waiting until the e2initialmodel process has reached 20% completion and then open this table.
3D Refinement
Expand the 3D Refinement entry in the EMAN2 tasks window in e2workflow. The Run e2refine task should appear beneath it. Select the Run e2refine task. This will pop up a form asking you to choose the raw input data and optionally the 'usefilt' data. You can choose from your regular, phase flipped and Wiener filtered particles (you can also specify files, but ignore this option for the time being). Choose the phase flipped data for the raw data and (optionally) the Wiener filtered data for the usefilt option. Then hit OK. This will pop up a form consisting of 6 tabs that asks you to specify the parameters required to run e2refine.py.
➡ There are a LOT of parameters. We have tried to select sensible defaults for all of them. Don't forget about tooltips when you need a little more help on a specific option, and don't be shy about asking for help.
In the Particles tab:
• Choose all or a subset of the images.
• Enter the number of refinement iterations.
• Double check the angstrom per pixel and particle mass.
• Select the low mem option on low end machines.
In the Model tab:
• Choose the initial model that you think is best. If unsure just double click on the models to look at them.
• Make sure the symmetry is correct (d7).
• Check automask3d and fill in the associated parameters. Tool tips should be informative. The threshold should probably be 0.8, and the mask dilations should both be about 5% of the width of your particle data. The radius parameter is particle specific, but 30 should be fine for GroEL.
In the Project3D tab:
• As a general rule of thumb leave the orientation generator and the projector unchanged.
• As a general rule of thumb do not include the mirror portion of the asymmetric unit. If you do include the mirror portion of the asymmetric unit you should change all of your aligners from 'rotate_translate_flip' to 'rotate_translate' (see the Simmx and Class Averaging tabs).
• Leaving the orientation distribution method as angle based (at 5 degrees) is fine. You can play with this at a later date. You may want to choose a coarser sampling (9 degrees, for example) when you're testing the workflow, as this will generate fewer projections and hence the refinement process will proceed more rapidly.
In the Simmx tab:
• Default parameters are fine.
In the Class Averaging tab:
• Default parameters are fine.
In the Make3D tab:
• As a general rule of thumb leave the reconstruction technique unchanged (Fourier). Enter the amount of padding to be used by the reconstruction algorithm. For example, if your particle images are 144x144 then a pad value of 192 should be fine.
• Other default parameters are fine.
Once you have filled in all of the parameters in the Run e2refine form hit OK. This should spawn the e2refine process, which you can monitor in the Running Tasks window. While e2refine is running you can click on the 3D Refinement task in the EMAN2 tasks window to see if any reconstructions have been generated. This table lists the most recently generated 3D model for each 3D refinement you have previously run or are currently running. You can explore the output from e2refine using the browser and investigating the directories labeled refine_00, refine_01, etc.
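Two of the settings above lend themselves to a quick back-of-the-envelope check: the mask dilation (about 5% of the box width) and the way the angular step controls the number of projections. The sketch below is plain Python, not EMAN2 code. The function names are hypothetical, and the projection count is only a rough area-based estimate (area of the asymmetric unit divided by the square of the step), so the count EMAN2 actually generates will differ. It is meant to show the trend between 5 and 9 degrees, not exact numbers.

```python
import math

# Number of symmetry operations for the point groups used here.
# A dihedral group Dn has 2n operations, so d7 has 14.
SYMMETRY_ORDER = {"c1": 1, "d7": 14}


def mask_dilation_pixels(box_size, fraction=0.05):
    """About 5% of the particle box width, in whole pixels."""
    return max(1, round(box_size * fraction))


def estimate_projection_count(step_deg, sym="d7", include_mirror=False):
    """Rough estimate of projections needed to cover the asymmetric unit.

    Assumption: one projection per (step x step) square-degree patch.
    Excluding the mirror portion halves the area to be covered.
    """
    sphere_sq_deg = 4.0 * math.pi * (180.0 / math.pi) ** 2  # about 41253
    unit_area = sphere_sq_deg / SYMMETRY_ORDER[sym]
    if not include_mirror:
        unit_area /= 2.0
    return max(1, math.ceil(unit_area / step_deg ** 2))


if __name__ == "__main__":
    print("dilation for a 144 pixel box:", mask_dilation_pixels(144))
    for step in (5.0, 9.0):
        print(step, "degrees ->", estimate_projection_count(step), "projections (rough)")
```

Because the estimate scales with the inverse square of the step, going from 5 to 9 degrees cuts the number of projections roughly threefold, which is why the coarser sampling is convenient for test runs.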
Resolution
After at least one refinement iteration has completed, click on the Resolution task. This will display a table with 4 columns titled 'Refinement Directory', 'Total Iterations', 'e2eotest', and 'e2resolution'. The Refinement Directory column lists the directories that store refinement data; you can double click on any of the listed items to look at the convergence plots (FSC curves) for that refinement. The Total Iterations column is just for your convenience; it displays how many iterations took place in the refinement directory. The e2eotest column is the result of the most recent e2eotest that you have run. The number displayed is what you would get according to the 0.5 cutoff criterion. You have probably not yet run e2eotest. You can do this by choosing the Run e2eotest task. The e2resolution column is the result of the most recent e2resolution that you have run. The number displayed is what you would get according to the 0.5 cutoff criterion. Update this column by choosing the Run e2resolution task.
The 'Run e2eotest' task
Select Run e2eotest. This will launch a form with three tabs titled General, Class averaging, and Make3d. Because particle classifications are already known for each refinement, all that needs to be done to perform an eotest is to regenerate the class averages and rerun make3d for each half of the input data.
• In the General tab choose your refinement directory and the iteration which will be used for the eotest. In general you should select usefilt if you used filtered data in the refinement. Enter the correct symmetry and check lowmem on less powerful machines.
• You can probably leave everything as is in the Class averaging tab.
• In the Make3d tab you can leave everything as is, but make sure to choose a good number for the padding (see hints in the 3D Refinement section above).
Once you're done hit OK. You can monitor progress of the task you just launched in the Running Tasks window. Once the e2eotest job is completed, open the Resolution form and double click on the directory where the eotest was just executed. This will enable you to view the newly generated FSC curve, along with any other convergence plots, in the EMAN2 plot window.
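To make the 0.5 cutoff criterion concrete, the sketch below reads a resolution off an FSC curve by finding where the curve first drops below 0.5 and taking the reciprocal of that spatial frequency. It is an illustration in plain Python, not the code the workflow uses: the function name is hypothetical, the curve values are made up, and linear interpolation between neighboring points is an assumption.

```python
def resolution_at_cutoff(freqs, fsc, cutoff=0.5):
    """Resolution in Angstroms where the FSC first falls below the cutoff.

    freqs: spatial frequencies in 1/Angstrom, in ascending order
    fsc:   correlation values at those frequencies
    Returns None if the curve never crosses the cutoff.
    """
    if len(freqs) != len(fsc):
        raise ValueError("freqs and fsc must have the same length")
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= cutoff > fsc[i]:
            # Linear interpolation between the two points around the crossing.
            t = (fsc[i - 1] - cutoff) / (fsc[i - 1] - fsc[i])
            f_cross = freqs[i - 1] + t * (freqs[i] - freqs[i - 1])
            if f_cross <= 0.0:
                return None
            return 1.0 / f_cross
    return None


if __name__ == "__main__":
    # Made-up curve for illustration only.
    freqs = [0.02, 0.04, 0.06, 0.08, 0.10]
    fsc = [0.99, 0.95, 0.70, 0.30, 0.10]
    res = resolution_at_cutoff(freqs, fsc)
    print("0.5 cutoff resolution: %.1f Angstroms" % res)
```

In this made-up example the curve crosses 0.5 halfway between 0.06 and 0.08 per Angstrom, so the reported value is 1/0.07, or about 14.3 Angstroms.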
The 'Run e2resolution' task
The e2resolution program is a much faster way of getting an estimate of resolution. It exploits a relationship that is known to exist between the signal to noise ratio and the Fourier shell correlation. You can only run e2resolution if you have specified the automask3d option in the Run e2refine form (see the 3D Refinement section of the workflow). The automask3d option causes a mask file to be generated which defines where the particle density is in refined 3D models. This mask is critical to the functioning of e2resolution. If you have specified the automask3d option for your refinements, click on the Run e2resolution task. This will display a form asking for the refinement directory and iteration, and the angstrom per pixel. Make your selection and hit OK. You can monitor progress of the task you just launched in the Running Tasks window. Once the e2resolution job is completed, open the Resolution form and double click on the directory where the job was executed. This will enable you to view the newly generated FSC curve, along with any other convergence plots, in the EMAN2 plot window.
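For readers wondering what kind of relationship links the signal to noise ratio to the Fourier shell correlation, one widely quoted form applies to two independent half-data reconstructions: FSC = SNR / (SNR + 1), where SNR is the spectral signal to noise ratio of each half. Whether e2resolution uses exactly this form is an assumption here, and the short Python sketch below (with hypothetical function names) only illustrates the formula itself.

```python
def fsc_from_snr(snr):
    """Expected FSC between two independent halves with the given SNR."""
    if snr < 0.0:
        raise ValueError("SNR cannot be negative")
    return snr / (snr + 1.0)


def snr_from_fsc(fsc):
    """Invert the relationship: SNR implied by a measured FSC value."""
    if not 0.0 <= fsc < 1.0:
        raise ValueError("FSC must be in [0, 1) to invert")
    return fsc / (1.0 - fsc)


if __name__ == "__main__":
    for snr in (0.25, 1.0, 4.0):
        print("SNR %.2f -> FSC %.2f" % (snr, fsc_from_snr(snr)))
```

Under this form, the 0.5 cutoff corresponds to the spatial frequency at which signal and noise power are equal in each half (SNR = 1).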