This page presents options for specific commands and parameters of Waarp R66 that you can tune to fit your needs.
With these options (usecpulimit), Waarp R66 Server can limit new requests according to a threshold on global CPU usage and a maximum number of concurrent requests. When one of these limits is exceeded, the request is refused and postponed for a random delay proportional to 30 seconds (option timeoutcon). After 3 retries, the request is cancelled.
These tests are done on both the requester and the requested side.
- cpulimit: the CPU threshold is a float between 0 and 1 (fraction of CPU usage), where 0 or 1 means no limit.
- usejdkcpulimit: CPU usage is computed either with native support from the JRE (not all JREs support this) or with the Java Sysmon library.
- connlimit: the maximum number of connections, starting from 0, where 0 means no limit.
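The limits described above can be sketched as a small admission check. This is only an illustration under stated assumptions: the function names, the strict "greater than" comparison and the range of the random factor are hypothetical and are not taken from the Waarp R66 code.

```python
import random

MAX_RETRIES = 3  # from the text: after 3 retries, the request is cancelled


def is_refused(cpu_usage, connections, cpulimit, connlimit):
    """Return True when a new request should be refused (hypothetical helper)."""
    # A cpulimit of 0 or 1 means "no CPU limit".
    cpu_exceeded = 0 < cpulimit < 1 and cpu_usage > cpulimit
    # A connlimit of 0 means "no connection limit".
    conn_exceeded = connlimit > 0 and connections > connlimit
    return cpu_exceeded or conn_exceeded


def retry_delay(timeoutcon=30.0, rng=random):
    """Random delay proportional to timeoutcon.

    The 0.5..1.5 factor is an assumption made for this sketch only.
    """
    return timeoutcon * rng.uniform(0.5, 1.5)


def next_action(refused, retries_done):
    """Decide what happens to a request after one admission check."""
    if not refused:
        return "accept"
    return "retry" if retries_done < MAX_RETRIES else "cancel"
```

For instance, with cpulimit=0.8 and connlimit=0, a request arriving while the CPU is at 90% would be postponed, and cancelled once the 3 retries are used up.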
By default, the real IP address used by the remote host is not compared against the IP (or its DNS resolution) stored with the remote HostId. If this is required for security reasons, you can enable the check. When activated through checkaddress:
- For a server, the test will always be done.
- For a client, it depends on whether you enable the specific option checkclientaddress; if so, 2 cases occur:
- If the client has an address of "0.0.0.0", then no test is done (this special address is meant for situations where many remote clients are used without strong security requirements).
- If not, then the check is done as for a server.
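The decision rules above can be summarized as a short function. This is a minimal sketch: the function and parameter names are hypothetical and only mirror the option names mentioned in the text.

```python
def must_check_address(checkaddress, is_client, checkclientaddress, declared_address):
    """Return True when the remote IP should be compared with the declared one."""
    if not checkaddress:
        # Default behavior: no comparison at all.
        return False
    if not is_client:
        # For a server, the test is always done.
        return True
    if not checkclientaddress:
        # For a client, the test requires the specific client option.
        return False
    # The special address "0.0.0.0" disables the test for that client.
    return declared_address != "0.0.0.0"
```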
In version 2.0, a new property is attached to the Host definition: isclient. This property identifies a remote client so that a server does not try to initiate a request to it. Since such a client is not a server, it does not listen for incoming requests and therefore cannot be the target of one.
The special address "0.0.0.0" can also be used to specify a client, with the additional effect that its IP address is not checked. It is useful when you have a lot of clients and you don't want to declare them all in the Host table. All those clients will have to share the ID and the password (and the SSL key in case of strong authentication).
In version 2.0, many improvements were made on the cryptographic side.
- SSL: You can now have simple SSL support or even strong authentication of clients using the trustuseclientauthenticate option.
- Password: all passwords are encrypted using a private DES key (private to each server, but it can be the same if you want). The passwords are encrypted both in files and in the database. This key is locally referenced by the cryptokey option. To encrypt your passwords (at least the administrator password), use the GoldengatePassword tool (see GoldenGate Commons) or the administration web interface.
- On the network, a single way crypto key is used, common to all OpenR66 hosts when they need to exchange passwords.
The special option taskrunnernodb allows you to have a persistent view of the transfer (the task) for a Thin Client without a database by using an XML file. It allows a stopped or canceled transfer to be restarted smoothly without starting a new transfer from the beginning.
For special projects where no database is needed, the server code now supports running without any database connection, at the cost of losing the ability to store all transfer statuses and the associated capabilities (such as automatic retry if not specified in the rules). However, it is still possible to store the trace of the transfers in XML files, so that one can automate some actions according to the status of each transfer through file analysis. Note however that some functionality, such as monitoring, will be limited. Setting the option taskrunnernodb to true enables the XML file support. If it is set to false, no information is retained outside the memory of the process.
Although an interrupted request restarts and each received packet should have been previously validated, the protocol includes a backward move on the packet rank in order to ensure the quality of the transfer. When a transfer is restarted, OpenR66 first checks the receiver rank and takes it as the reference minus a gap. Then it checks the existing file (reception side) and checks again against this rank minus a gap. You can control this feature by specifying the block size (blocksize) and the gap of rank (gaprestart), where the retransmitted byte size will be: block size x rank.
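One possible reading of the restart rule above is sketched here. It is an interpretation, not the OpenR66 implementation: the function names are hypothetical, and deriving a rank from the size of the existing file (size divided by blocksize) is an assumption of this sketch.

```python
def restart_rank(receiver_rank, blocksize, gaprestart, existing_file_size):
    """Return the packet rank from which a restarted transfer would resume."""
    # Step 1: take the receiver rank as reference, minus the gap.
    rank = max(0, receiver_rank - gaprestart)
    # Step 2 (assumption): derive a rank from the file already on disk,
    # apply the gap again, and keep the more conservative of the two.
    file_rank = existing_file_size // blocksize
    return min(rank, max(0, file_rank - gaprestart))


def restart_offset(receiver_rank, blocksize, gaprestart, existing_file_size):
    """Byte position corresponding to the restart rank."""
    return restart_rank(receiver_rank, blocksize, gaprestart, existing_file_size) * blocksize
```

With blocksize=65536 and gaprestart=10, a receiver at rank 100 holding a complete partial file would resume at rank 90, so the last 10 blocks are sent again.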
In order to improve the efficiency of external commands execution by preventing the fork of the Waarp R66 JVM process (costly regarding its memory), there is an optional support to execute those commands through a Waarp LocalExec Daemon (see in Waarp's Family). To use this support, you have to define the following:
- uselocalexec (default="False") to enable the use of LocalExec support
- lexecaddr (optional) which contains the address to be used (default="127.0.0.1")
- lexecport (optional) which contains the port to be used (default="9999")
FastMD5 is no longer recommended. In order to improve the efficiency of the computation of MD5 on blocks during file transfer, 3 methods are provided.
- Version 1: usefastmd5=False => Will use the internal JDK support for MD5 computation
- Version 2: usefastmd5=True but fastmd5 not declared or empty => Will use a Java MD5 implementation more efficient than the JDK version
- Version 3: usefastmd5=True and fastmd5=path to the SO file or DLL (according to the system) => Will use a JNI C MD5 implementation supposed to be more efficient than Version 2.
However, note that the results can vary depending on the system and the JDK. Most of the time, Version 3 is the fastest, but sometimes Version 2 is better than Version 3, and always better than Version 1. The efficiency of Version 3 greatly depends on the compilation of the C module. See GoldenGate Digest in GoldenGate Commons.
With recent JVM versions, the JDK implementation in server mode appears to be really efficient, almost equivalent to the C JNI version. So one might use usefastmd5=False by default.
Note also that a new option, digest, specifies which kind of digest you want to use (it is for now a global option, not a local option), with the following values: MD5=0, MD2=1, SHA1=2, SHA256=3, SHA384=4, SHA512=5, CRC32=6, ADLER32=7
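To make the numeric codes of the digest option concrete, here is a small sketch using Python's standard library. The code table comes from the list above; the function name and the hexadecimal output format are assumptions of this example and say nothing about how Waarp encodes digests on the wire.

```python
import hashlib
import zlib

# Mapping taken from the values listed for the digest option.
DIGEST_CODES = {
    0: "MD5", 1: "MD2", 2: "SHA1", 3: "SHA256",
    4: "SHA384", 5: "SHA512", 6: "CRC32", 7: "ADLER32",
}


def compute_digest(code, data):
    """Return a hex digest of data for the given digest code."""
    name = DIGEST_CODES[code]
    if name == "CRC32":
        return format(zlib.crc32(data) & 0xFFFFFFFF, "08x")
    if name == "ADLER32":
        return format(zlib.adler32(data) & 0xFFFFFFFF, "08x")
    # hashlib.new raises ValueError when the local OpenSSL build does not
    # provide the algorithm (often the case for MD2).
    return hashlib.new(name.lower(), data).hexdigest()
```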
In case a database is shared among several R66 servers (with different names, so not using the Multiple Monitors support option), the following tables will be totally shared:
- Host table: all partner definitions, including the server itself, will be shared among all servers. It also implies that the key used to encrypt/decrypt the passwords is the same for all servers sharing the database.
- Rules table: all rule definitions will be shared. Differences in the actual actions can be made either in the recv/send part of the tasks, or by using local variables (see the R66 Task Options) or local scripts.
The following tables, even if shared, will have different entries for each server:
- Configuration table: the bandwidth limitation will be independent for each server.
- Runner table: each transfer will be owned by one server only. Even if 2 servers are partners for the very same transfer, there will be 2 lines in the database, one for each server (requester and requested).
- MultipleMonitor table: this table is unused when multiple monitors are not used; with multiple monitors and several clusters on the same database, each cluster will act as a single host (so sharing or not sharing accordingly) and one line per cluster will be set up in this table.
Note that database sharing implies that multiple servers will access one database through the network. Locality (same datacenter at least) is an important property since the implied latency could impact performance. Moreover, the database shall be "strong" enough to accept such concurrent accesses (CPU, memory, disk speed, even the database technology, PostgreSQL being recommended for such a configuration). Therefore the server hosting the database and the database configuration itself shall be set up with this goal in mind.
In order to improve the reliability and scalability of the OpenR66 File Transfer Monitors, we propose a new option that spreads the load behind a Load Balancer in TCP mode (such as HAProxy) with a shared storage (such as a simple NAS).
- multiplemonitors=1 => No multiple monitors will be supported (single instance)
- multiplemonitors=n => n servers will be used as a single instance to spread the load and increase the high availability
Note that some specific attention is needed. The IN, OUT and WORK storages must be shared so that any server can act on those files, as must any other storage used from the beginning of the transfer (pre-task) to its end (post-task). The Load Balancer must also be correctly configured in TCP mode so as to spread the load and keep the connection between 2 partners once it is opened.
The principle is as follows:
- Put in place a TCP load balancer that maintains a TCP connection with one server behind it and spreads new connection attempts over the pool of available servers. The algorithm to spread the load could be, for instance, the fewest connections open at that time. Detecting the open port could be enough to test the availability of the service. In more complex configurations, one could also implement a Java method that does a “message” call to the proposed target server in order to test its availability.
- The IP/port of the load balancer service will be the IP/port of the R66 service. Behind it, you will have a set of R66 servers with their own internal IP/port pairs (over which the load balancer spreads the load).
- Note that if the LB is “transparent”, meaning the client IP is passed unchanged to the real server behind it, the IP check is possible on that pool of servers. Conversely, if the IP shown to the real R66 server is the one of the LB, the IP check is not possible. However, even if the LB is transparent, the client might still see the LB IP and not the real R66 server's IP, which prevents the IP check on the client side. So particular attention is needed if one wants to enable IP checking while using a LB in front of a pool of R66 servers.
- All R66 servers behind the LB will share the exact same name (ID), both for non SSL and SSL, and will also share the same database. They will also have to share the IN, OUT and WORK directories, and probably any other resources needed for the pre, post and error tasks (through a NAS for instance).
- All R66 servers will have to specify the same multiplemonitors option with the number of servers in the pool.
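The "fewest connections" strategy mentioned above for the load balancer can be illustrated as follows. This is a toy model, not a load balancer configuration: the pool structure, the server names and the function name are hypothetical.

```python
def pick_server(pool):
    """Pick the available server with the fewest open connections.

    pool maps a server name to a tuple (open_connections, port_is_open),
    where port_is_open stands for the result of the open-port probe.
    """
    available = {name: conns for name, (conns, port_is_open) in pool.items() if port_is_open}
    if not available:
        return None  # no server can take the new connection
    return min(available, key=available.get)


# Hypothetical pool of three R66 servers behind the load balancer.
pool = {
    "r66-a": (12, True),
    "r66-b": (3, True),
    "r66-c": (1, False),  # port closed: excluded despite the low count
}
```

Here pick_server(pool) selects "r66-b". Connections that are already open stay on their server; only new connection attempts go through this choice.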
In theory, this enables the following HA capabilities:
- A load balance of all transfers among several clusters (horizontal scalability).
- A restart on disconnection, even on a crash of the original R66 server, since the new connection will go through the LB algorithm.
Note however that obtaining an HA R66 service does not strictly require this option to be enabled. Indeed, one could regularly check through a monitoring tool that the service is still responding (using a message request for instance) and, if not, stop/restart the R66 service accordingly. Since the restart of a request is merely related to the “timeout” time, a check repeated at roughly that interval should enable a “clean” HA availability without the complexity of a LB configuration.
One could also mix the two solutions, in order to restart one unresponsive server in the HA pool.
The “usethrift” option (specifying a port > 0) enables the Thrift server support in one R66 server. Currently only the synchronous binary Thrift protocol is allowed, so a client should use “TSocket” for its TTransport and “TBinaryProtocol” for its TProtocol.
One example in Java is given in org.waarp.openr66.protocol.test.TestThriftClientExample to show how to interact with the Thrift R66 service.
This option should make it easier to embed R66 in existing applications, beyond Java alone.
The currently available methods are:
- transferRequestQuery: initiates a submitted transfer from the related R66 server to another one. Note that the request can be asynchronous (returns immediately once the request is submitted) or synchronous (returns only once the request is done, whether in error or in success; it does not take into account any future reschedule, as it returns the status once the current try is over).
- infoTransferQuery: requests some information on one particular transfer request
- isStillRunning: quickly tells whether one particular transfer request is running or not
- infoListQuery: gets the information on the file list on the local R66 server
Note that, for security reasons, the Thrift service is only opened on the 127.0.0.1 address since no authentication is made; it is intended as a local service.
In the early stage of the transfer, some tests are made to validate that the request is valid. Among those tests, there are:
- incompatible rule setup between the 2 partners (both requiring they act as the sender for instance)
- file not found or not readable
- request already started or already totally finished
In those cases, no error tasks were previously run. Error tasks are now run, in particular to enable the “reschedule” task. However, if one does not want such error tasks to run (globally for all rules) in such conditions (before any pre task is executed), the following option has to be passed to the R66 server java command:
With R66, it is possible to instantiate R66 as a Windows service through Apache Commons Daemon.
In the source of R66, you will find in the directory org.waarp.openr66.service the script service.bat. This script has to be updated to reflect your configuration.
rem -- DO NOT CHANGE THIS ! OR YOU REALLY KNOW WHAT YOU ARE DOING ;)
rem -- Organization:
rem -- EXEC_PATH is root (pid will be there)
rem -- EXEC_PATH\..\logs\ will be the log place
rem -- EXEC_PATH\windows\ is where prunsrv.exe is placed
rem -- DAEMON_ROOT is where all you jars are (even commons-daemon)
rem -- DAEMON_NAME will be the service name
rem -- SERVICE_DESCRIPTION will be the service description
rem -- MAIN_DAEMON_CLASS will be the start/stop class used
rem -- Root path where the executables are
rem -- Change this by the path where all jars are
rem -- Service description
set SERVICE_DESCRIPTION="Waarp R66 Server"
rem -- Service name
rem -- Service CLASSPATH
rem -- Service main class
rem -- Path for log files
rem -- STDERR log file: IMPORTANT SINCE LOG will be there according to logback.xml
rem -- STDOUT log file: IMPORTANT SINCE LOG will be there according to logback.xml
rem -- Startup mode (manual or auto)
rem -- JVM option (auto or full path to jvm.dll, if possible pointing to server version)
rem example: C:\Program Files\Java\jdk1.7.0_05\jre\bin\server\jvm.dll
rem -- Java memory options
rem -- Logback configuration file: ATTENTION recommendation is to configure output to STDOUT or STDERR
rem -- R66 configuration file
rem -- prunsrv.exe location
rem -- Loglevel of Daemon between debug, info, warn, error
In order to facilitate the integration in application modules, OpenR66 now supports running specific Java classes in 3 ways. Note that this functionality is only available from version 2.3.
One is through pre, post or error tasks using the EXECJAVA keyword, followed by the full class name, which must implement the R66Runnable interface.
Another one is through the specific R66Business command, which also executes an R66Runnable implementation, for instance by extending the AbstractExecJavaTask abstract class.
Finally, it is possible to associate a Business Class (see R66BusinessInterface) through a Business Factory (see R66BusinessFactoryInterface) with each transfer; it will run several methods at the various steps that can occur:
- void checkAtStartup(R66Session session): launched at the very startup of the transfer and before the pre commands
- void checkAfterPreCommand(R66Session session): launched after the pre commands and before the transfer starts
- void checkAfterTransfer(R66Session session): launched after the transfer is finished and before the post commands
- void checkAfterPost(R66Session session): launched after the post commands and before the end of the request
- void checkAtError(R66Session session): launched once an error occurs
- void checkAtChangeFilename(R66Session session): launched if the filename is changed during the commands (pre or post)
- void releaseResources(): launched at the very end, to release any internal resources that should be released
- String getInfo() and void setInfo(String info): called programmatically (from business code) to set a special info (as a String) and to retrieve it at any time.
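The order in which the hooks listed above are called can be illustrated with a small stand-in. This is not the Java interface itself: RecordingBusiness and run_transfer are hypothetical Python names that only mirror the documented hook names, and the pre, transfer and post callables stand for the real steps. The hook checkAtChangeFilename is left out of this sketch.

```python
class RecordingBusiness:
    """Hypothetical stand-in that records which hooks were called."""

    def __init__(self):
        self.calls = []

    def checkAtStartup(self, session):
        self.calls.append("checkAtStartup")

    def checkAfterPreCommand(self, session):
        self.calls.append("checkAfterPreCommand")

    def checkAfterTransfer(self, session):
        self.calls.append("checkAfterTransfer")

    def checkAfterPost(self, session):
        self.calls.append("checkAfterPost")

    def checkAtError(self, session):
        self.calls.append("checkAtError")

    def releaseResources(self):
        self.calls.append("releaseResources")


def run_transfer(business, session, pre, transfer, post):
    """Run the steps in the documented order; return True on success."""
    try:
        business.checkAtStartup(session)
        pre(session)
        business.checkAfterPreCommand(session)
        transfer(session)
        business.checkAfterTransfer(session)
        post(session)
        business.checkAfterPost(session)
        return True
    except Exception:
        business.checkAtError(session)
        return False
    finally:
        # Always called at the very end, on success or on error.
        business.releaseResources()
```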
Note that to allow a host to call a Business Request, it has to be added in the configuration file as <business><businessid>hostname</businessid>...</business>. If not set, the host will not be allowed. For EXECJAVA, security relies first on the rule being local to the host only, and second on the rule's ability to limit the hosts allowed as its partners.
With R66, it is possible to forward or receive a file by FTP. In passive mode (an external FTP client connects to the server to initiate a transfer), one can use the Waarp Gateway FTP. In active mode (the server connects to a remote FTP server to initiate a transfer), one can use the integrated FTP client (based on FTP4J) as a task after or before a file transfer. This client is compatible with FTP, FTPS and FTPSE. In R66, the task is named FTP. See the package Waarp Ftp Client in Waarp Commons.
With R66, it is possible to install a proxy/reverse proxy in a DMZ in order to ensure a high level of security. This Proxy/RP will forward any request to the defined target. No database is needed for this R66 Proxy. See the package Waarp Proxy R66 in Waarp Commons.
With R66, it is now possible (from 2.4.10) to have a single digest for the whole file transfer. Previously, the digest was optional and computed per packet only. While this is still possible, one global digest can also be computed efficiently over the full file transfer. This option is activated by default but can be deactivated by setting the <limit><globaldigest> entry to false or 0.
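The difference between per-packet digests and one global digest can be sketched as follows. This is an illustration only: the function name is hypothetical, and SHA-256 is chosen here simply because it is available in every Python build; it is not a statement about the algorithm R66 uses.

```python
import hashlib


def digest_transfer(blocks, algorithm="sha256"):
    """Return (per-block digests, global digest) for a sequence of blocks."""
    global_digest = hashlib.new(algorithm)
    per_block = []
    for block in blocks:
        # Per-packet digest: validates each block independently.
        per_block.append(hashlib.new(algorithm, block).hexdigest())
        # Global digest: updated incrementally, so the file is read only once.
        global_digest.update(block)
    return per_block, global_digest.hexdigest()
```

The global digest obtained this way equals the digest of the whole file computed in one pass, which is what makes a single end-of-transfer check possible.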
With R66, it is now fully supported (from 2.4.10) to send a file transfer request to itself, whether using a direct file transfer or a submitted file transfer.
Previously, transmitted filenames could not contain any "blank" character since it could introduce some issues. In order to allow such characters in filenames, a change that could lead to backward incompatibility was made in version 2.4.13. This change uses ':' as the separator (considering this character is not allowed in most filename implementations). The code can still accept the blank character as separator (as in versions <= 2.4.12) and is therefore backward compatible. However, if one wants to keep the old behavior, one can force the R66 server to use the blank separator by specifying the following property on the java command:
From 2.4.14, the separator ';' is used instead of ':', since ':' was a mistake (Windows usage of ':').
When someone wants to stop a Waarp server, they might first want to wait until all requests are finished. To do that, it is now possible to ask the Waarp server to block all new requests while letting the existing ones continue. This operation is reversible: the server can be unblocked as well. This can be achieved either through the web administration interface, or through the ServerShutdown command using the extra '-block' or '-unblock' option.
It is possible, up to a certain extent, to make Waarp compatible with internationalization. To change the default locale, use the following property at launch time:
Where xx can be one of "en", "fr".
Note that commands can also have a special extra argument: the output format as one of
-csv : output will be as one line for the title, one line for the data, all fields separated by ';'
-property : output will be one value per line, as name=value
-xml : output will be in XML format
-json : output will be in JSON format (default)
-quiet : no output will be done (only logging)
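As a rough illustration of what the csv, property and json formats described above look like for a single record, consider the sketch below. The function name and the field names are hypothetical, and the XML layout is not covered since the text does not describe it.

```python
import json


def format_output(record, fmt="json"):
    """Render one record (a dict with string keys) in the requested format."""
    if fmt == "csv":
        # One line for the titles, one line for the data, ';' as separator.
        titles = ";".join(record)
        values = ";".join(str(value) for value in record.values())
        return titles + "\n" + values
    if fmt == "property":
        # One value per line, as name=value.
        return "\n".join(f"{name}={value}" for name, value in record.items())
    if fmt == "json":
        return json.dumps(record)
    if fmt == "quiet":
        return ""  # nothing is printed, only logging happens
    raise ValueError(f"unsupported format in this sketch: {fmt}")


# Hypothetical record, for illustration only.
record = {"id": "42", "status": "OK"}
```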
You can use the standard wildcard characters '*' and '?' in your request. Note however the following issues and how to address them:
- When using wildcard characters combined with DirectTransfer or SubmitTransfer, the command is in error if the result gives multiple files. To enable multiple files resolution, use MultipleDirectTransfer or MultipleSubmitTransfer. Note that on MultipleSubmitTransfer, if the request is a "RECV" request, you can specify the option "-client" which allows the MultipleSubmitTransfer to run a RequestInformation first to the remote partner in order to get the list of remote files.
- In some special cases, wildcard characters are badly interpreted (Apache Commons or Shell first level of interpretation), for instance as "*" , "*.*" or "*xx" . In particular, the shell might immediately replace the value, which is not the desired result. In order to allow a "non-interpreted" wildcard character, Waarp allows using the char '§' in place of '*', which for the previous examples gives "%", "%.%" , "%xx". You can of course still continue to use '*' and '?'.
- Each host name specified here will have the ability to make business requests (special Java Class to handle B2B functionalities). This information can be passed through the XML configuration file or through the Business field of the Host configuration in the database (System Menu). The format is:
- If specified for one host, this will override database roles. By default, local server should be added as role = FULLADMIN in XML file. This information could be passed through the XML configuration file or through the Roles field of the Host configuration in the database (System Menu). The format is:
- Where idx is a host id (1 by 1) for which you require to override the default database roles
- Where rolesSet is a set of roles, with separators as blank or '|'
- The roles assigned to this host are chosen among NOACCESS, READONLY, TRANSFER, RULE, HOST, LIMIT, SYSTEM, LOGCONTROL, PARTNER(READONLY,TRANSFER), CONFIGADMIN(PARTNER,RULE,HOST), FULLADMIN(CONFIGADMIN,LIMIT,SYSTEM,LOGCONTROL)
- Example: PARTNER|LOGCONTROL
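The role list above suggests that PARTNER, CONFIGADMIN and FULLADMIN are shorthands for groups of basic roles. Under that reading (an assumption of this sketch, as are the names used), a rolesSet string can be parsed and expanded like this:

```python
import re

BASIC_ROLES = {
    "NOACCESS", "READONLY", "TRANSFER", "RULE",
    "HOST", "LIMIT", "SYSTEM", "LOGCONTROL",
}
# Composite roles, as written in parentheses in the list above.
COMPOSITE_ROLES = {
    "PARTNER": ["READONLY", "TRANSFER"],
    "CONFIGADMIN": ["PARTNER", "RULE", "HOST"],
    "FULLADMIN": ["CONFIGADMIN", "LIMIT", "SYSTEM", "LOGCONTROL"],
}


def expand_roles(roles_set):
    """Split a rolesSet on blanks or '|' and expand composite roles."""
    pending = [role for role in re.split(r"[ |]+", roles_set.strip()) if role]
    result = set()
    while pending:
        role = pending.pop().upper()
        if role in COMPOSITE_ROLES:
            pending.extend(COMPOSITE_ROLES[role])
        elif role in BASIC_ROLES:
            result.add(role)
        else:
            raise ValueError(f"unknown role: {role}")
    return result
```

With the example above, expand_roles("PARTNER|LOGCONTROL") yields READONLY, TRANSFER and LOGCONTROL.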
This will allow alias usage for host ids. This information could be passed through the XML configuration file or through the Aliases field of the Host configuration in the database (System Menu). The format is:
- Where realId is the real host id that will have aliases (locally defined).
- Where aliasSet is a set of aliases, with separators as blank or '|'
- Example: alias1|alias2
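Alias resolution as described above amounts to a reverse lookup from alias to real host id. The following sketch assumes a plain dictionary as input; the function names and the host ids are hypothetical.

```python
import re


def build_alias_map(definitions):
    """Turn {realId: aliasSet} into a lookup table {alias: realId}."""
    alias_map = {}
    for real_id, alias_set in definitions.items():
        for alias in re.split(r"[ |]+", alias_set.strip()):
            if alias:
                alias_map[alias] = real_id
    return alias_map


def resolve_host(host_id, alias_map):
    """Return the real host id for an alias, or the id itself if not an alias."""
    return alias_map.get(host_id, host_id)


# Hypothetical definitions, for illustration only.
aliases = build_alias_map({"hostA": "alias1|alias2", "hostB": "b1 b2"})
```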
- By default, this field contains the <root><version>version</version></root> XML information, handled by Waarp to check the database configuration version against the Waarp program, in order to allow automatic updates.
- Note that automatic update can be prevented by setting <db><dbcheck>False</dbcheck>...</db> in the XML configuration file or through the Java property -Dopenr66.startup.dbcheck=0
- In case the database is shared among several R66 servers, to be able to see all transfer logs from the Administration Web interface, you need to set a special option in the "Other informations" with the identifier that will be used to connect to this web interface.
Such content allows any of the ids id1, id2, ... or idn to see, once connected to the administration web interface, the full content of the database from the transfer menu. Note however that those ids also need the CONFIGADMIN role, since this ability has to be controlled (see the Roles item for how to configure roles).
While a server/client should have an acceptable limit on open files per process (or globally for the system), we recommend a value of at least 4096, and more depending on your use case.
However, you can limit the impact of the file descriptors opened by NIO selectors by decreasing or setting the serverthread value. For instance, a value of 1 implies around 250 descriptors, while a value of 5 implies around 850 descriptors.
According to your needs, you can set the number of cores to use (serverthread), the default being the number of cores * 2 + 1; a minimum value of 1 is allowed (0 meaning the number of cores is used).
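The rule for the effective serverthread value can be written down as a short function. This is a sketch of the rule as stated above, not of the Waarp code: the function name is hypothetical, and treating negative values as 1 is an assumption.

```python
import os


def effective_server_threads(configured, cores):
    """Return the thread count implied by the serverthread setting.

    configured is None when the option is not set at all.
    """
    if configured is None:
        return cores * 2 + 1  # documented default
    if configured == 0:
        return cores  # 0 means "use the number of cores"
    return max(1, configured)  # 1 is the minimum allowed value


# Value for the current machine when the option is left unset.
default_here = effective_server_threads(None, os.cpu_count() or 1)
```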