ProteoSAFe
Jeremy Carver
ver. 1.2.6
jjcarver@ucsd.edu
June 1, 2013

NOTE: The software contained in this package is being released in a beta state. The software has been thoroughly tested, but may be subject to unforeseen defects or instability. Please feel free to forward any comments or bug reports to:

    ccms@proteomics.ucsd.edu

NOTE: For usability purposes, SSL login security has been disabled in this release version of ProteoSAFe. Most users will wish to use ProteoSAFe on their personal computers, where account security is not paramount, and will prefer not to have to deal with web browser certificate errors. However, if you wish to install ProteoSAFe in a more secure environment with private accounts, then after the system is installed you can uncomment the relevant element in the webapp configuration file found at:

    /tomcat/webapps/ProteoSAFe/WEB-INF/web.xml (line 306)

Removing the XML comment symbols ("<!--" and "-->") surrounding this section will cause ProteoSAFe to automatically redirect to SSL whenever user account activity takes place. This significantly improves security, at the expense of SSL certificate errors that the user must override manually.

An updated release of this software will soon be available at:

    http://proteomics.ucsd.edu/Downloads/ProteoSAFe/

1. Prerequisites

    a. Windows Vista/7 onward or Linux
    b. Java Platform, Standard Edition version 6+ (JDK/JRE/OpenJDK recommended)
    c. Graphviz (strongly recommended)
    d. Up-to-date web browser (Firefox, Safari, Chrome, Internet Explorer, Opera, etc.). The viewing experience has been optimized for Firefox v3.6+, so that browser is recommended.
    e. 1GB+ free memory
    f. 60MB+ disk space (installation only)

2. Installation

a. Extract the archive

Find the archive that matches your platform (Windows or Linux, 32-bit or 64-bit), and extract it to a folder of your choice.
After extraction, you will find the following subfolders:

    data      : Storage for user data, tasks, results, HSQL data
    demo      : Default demo tasks
    hsql      : Default HSQL database, used when none is present in [data]
    logs      : Storage for logs
    server    : Specifications, resources, and utilities for system services
    tomcat    : Apache Tomcat 6 Web Server
    tools     : Executables and resources for CCMS tools
    worker    : Tool specifications and execution storage
    workflows : Workflow specifications

In addition, you will see two scripts (.bat for Windows, .sh for Linux). Invoke them to start up or shut down the system.

b. Set environment variables

Six environment variables are available to configure the system:

* CCMS_HTTP_PORT
* CCMS_HTTPS_PORT

    The system is accessed via a Web interface. The default HTTP and HTTPS ports for the system are 8080 and 8443, respectively. If you have other Web servers installed on your machine that occupy either port, set these variables to unused ports. (See SSL security note at the top of this file.)

* CCMS_TOMCAT_PORT

    Tomcat listens on a port for termination directives. (Default: 8005.)

* CCMS_DAPPER_PORT

    An internal process uses port 10100 to coordinate workflow executions. If for any reason that port is unavailable to the system, set this variable to another port.

* CCMS_STORAGE_PATH

    To store data, tasks, results, and the HSQL database in a non-default path, set this environment variable.

* CCMS_DOT_PATH

    Installing Graphviz is recommended; it allows the system to draw task progress via the "dot" utility. If you have Graphviz installed but the "dot" program is not found in the system path, point this environment variable to where "dot" is located (its parent folder, not the executable itself).

In addition to these environment variables, you should also set JAVA_HOME or JRE_HOME, or add the JVM to the system path. Please see the Java documentation for more details.
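As an illustration of the CCMS variables described above, the following sketch sets them in a Linux shell before the startup script is run. Only the variable names come from this document; every value shown (port numbers, storage path, location of "dot") is a made-up example, not a default or a recommendation.

```shell
# Example values only; choose ports that are free on your machine.
export CCMS_HTTP_PORT=9080        # instead of the default 8080
export CCMS_HTTPS_PORT=9443       # instead of the default 8443
export CCMS_TOMCAT_PORT=9005      # instead of the default 8005
export CCMS_DAPPER_PORT=10200     # instead of 10100

# Hypothetical storage location for data, tasks, results, and HSQL data.
export CCMS_STORAGE_PATH="$HOME/proteosafe-data"

# Folder that contains the "dot" executable (assumed here to be /usr/bin).
export CCMS_DOT_PATH=/usr/bin
```

Variables exported this way are visible only to programs started from the same shell session, so the startup script should be invoked from that session.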
For instance, to set JAVA_HOME in a Windows command-line environment, execute this before starting the system:

    SET JAVA_HOME=C:\Program Files\java\jdk1.6.0_25

In a Linux bash shell:

    export JAVA_HOME=/usr/lib/jvm/java-6-sun

If you are not familiar with Windows/Linux environment variables, this is a good starting point:

    http://en.wikipedia.org/wiki/Environment_variable

c. Advanced Settings

* CCMS_MAX_EXEC

    This threshold controls how many tool instances are allowed to run in parallel. The default is 3. If your machine has multiple cores or is hyper-threaded, you can increase the number. However, the higher the threshold, the more CPU and memory are consumed; you might need to increase CCMS_TOMCAT_MAX_MEM too.

* CCMS_JAVA_MEM

    This parameter limits the memory that the Java virtual machine can use. It is passed to the underlying Java virtual machine as the -Xmx option. The value can be an integer specifying the memory size in bytes, or an integer suffixed with a unit character: 'k' or 'K' for kilobytes, 'm' or 'M' for megabytes, 'g' or 'G' for gigabytes. For instance: 256000000, 256m, 1G. The default value is 256m.

d. Install extra protein sequences

You can install frequently used protein sequence databases. Once installed, they can be selected easily for your tasks via a pull-down menu. Sequence databases must be FASTA files. To install them, create a folder under [server/resources/sequence] and then copy your FASTA files to that folder. The folder name will be used in the pull-down menu to identify this set of sequence databases. You can bundle multiple related sequence databases into a set; when the set is selected, all of its databases are selected. You can find and install the all-species UniProt/Swiss-Prot FASTA file included in the CD distribution.

3. Execution

a. Start the system

To start the system, invoke startup.bat or startup.sh. The script can be invoked from anywhere; it picks up its own path automatically. There is no need to change or set the working directory.
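After the startup script is invoked, the web interface may take a short while to respond. When startup is automated, one option is to poll the server until it answers. The following is a minimal sketch for Linux, assuming curl is installed; the function name wait_for_server and its arguments are illustrative and are not part of ProteoSAFe.

```shell
# Poll a URL until it answers without an HTTP error, or give up.
#   $1: URL to probe
#   $2: maximum number of attempts (default 30)
#   $3: seconds to sleep between attempts (default 2)
# Returns 0 once the URL responds, 1 if all attempts fail.
wait_for_server() {
    wfs_url=$1
    wfs_max=${2:-30}
    wfs_delay=${3:-2}
    wfs_try=0
    while [ "$wfs_try" -lt "$wfs_max" ]; do
        # --fail makes curl exit non-zero on HTTP errors (status >= 400).
        if curl --silent --fail --output /dev/null "$wfs_url"; then
            return 0
        fi
        wfs_try=$((wfs_try + 1))
        sleep "$wfs_delay"
    done
    return 1
}

# Example call (commented out so that sourcing this file has no side effects):
# wait_for_server "http://localhost:8080/ProteoSAFe" && echo "server is up"
```

With the defaults, the function gives up after 30 attempts spaced 2 seconds apart; adjust the second and third arguments for slower machines, and change the port in the URL if CCMS_HTTP_PORT was set.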
Wait about 10 seconds to 1 minute until the server is fully up.

b. Use the system

The system provides its services via a Web interface; you need a web browser to use it. The system is accessible at the URL:

    http://localhost:8080/ProteoSAFe

You can also visit the system via Secure Sockets Layer (SSL) at the URL:

    https://localhost:8443/ProteoSAFe

If you use different ports for HTTP or HTTPS as instructed in (2b), remember to use the ports you specified. (See SSL security note at the top of this file.)

If you have manually enabled SSL within the web application, then the first time you access the system with each individual browser, always use the HTTPS scheme instead of the default HTTP scheme. Not doing so might cause your browser to block the sign-in functionality. When you do so, the browser will ask whether to trust the site. You should instruct your browser to trust the system; please check your browser's documentation for details on trusting site certificates.

To sign in to the system or register for an account, use the login box on the upper right of the logo banner. You can also use the default administrator account:

    account : default
    password: defaultpassword

c. Shut down the system

To shut down the system, invoke shutdown.bat or shutdown.sh. Like the start-up script, this script can be invoked from anywhere.

d. Clean up

If the system has used up too much disk space, you can free some space as follows:

    i) Delete tasks you no longer need. Sign in as the administrator and browse to the task; you will see a delete link. Follow the link to delete the task. Be careful: a deleted task cannot be recovered.

    ii) Delete log files in [logs] and [tomcat/logs], or temporary files in [worker/local] and [worker/tasks]. Make sure the system is shut down when you perform the deletion.

4. Update

When a new version is released, you can update your system by extracting the new archive over the old installation. Your data and tasks will not be lost.

5. Copyright and License Notices

Please refer to 'COPYRIGHT' and 'NOTICE' for the copyright notice and the license notices for 3rd-party software.

6. Known Problems

Tasks in execution will fail after the machine hibernates and resumes.

7. Feedback

For suggestions and questions, please contact us at ccms@proteomics.ucsd.edu. We will record known problems as well as solutions at:

    http://proteomics.ucsd.edu/ProteoSAFe_problems.html

8. Trademark Notice

Oracle and Java are registered trademarks of Oracle and/or its affiliates.
OpenJDK is a trademark or registered trademark of Sun Microsystems, Inc. in the United States and other countries.
Linux is the registered trademark of Linus Torvalds in the United States and other countries.
Windows, Windows Vista, Windows 7, and Internet Explorer are registered trademarks of Microsoft Corporation in the United States and other countries.
Google Chrome is a trademark of Google Inc. Use of this trademark is subject to Google Permissions.
Apache Tomcat is a trademark of the Apache Software Foundation.
Cygwin is a registered trademark of Red Hat, Inc.
Firefox is a registered trademark of the Mozilla Foundation.
Graphviz is a registered trademark of AT&T.
Opera is a registered trademark of Opera Software.
Safari is a trademark of Apple Computer, Inc.
SWISS-PROT is a trademark of Institut Suisse de Bioinformatique (ISB).
Wikipedia is a registered trademark of the Wikimedia Foundation, Inc., a non-profit organization.