About two months ago we released the new Image Uploader 7.
Looking through the list of new features and improvements, you may have noticed that increased upload speed is one of the major ones. Over the last few weeks several customers have asked us for more information about these improvements. In this post I am going to explain in detail what has changed in Image Uploader with regard to upload speed.
Upload package
First of all, let us look at the so-called "upload package", which is the key to understanding how Image Uploader processes and uploads files. By processing I mean what the uploader does with the files selected for upload: web developers can make the component resize images or compress files into ZIP format prior to upload. An "upload package" is a set of files selected for upload by a user; these files are uploaded in a single HTTP POST request (if chunks are disabled) and are optimized all together by Image Uploader before the upload stage. Following this logic, each "upload package" goes through two main stages: file optimization and file upload, and the upload stage starts only after the optimization stage has finished.
By default, Image Uploader sends all files in a single "upload package". This means that no matter how many files a user selects, they are all processed from the first to the last, and uploading starts only after processing has finished. This behavior can be customized: you, as a web site developer, can specify how many files are included in a package (using the uploadSettings.filesPerPackage property). You can even set Image Uploader to upload one file per package; in this case you will have as many packages as there are files selected for upload.
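To make the grouping concrete, here is a minimal sketch of how a list of selected files could be split into packages of a fixed size. It is not Image Uploader code: the function name split_into_packages and the convention that a non-positive value means "everything in one package" are assumptions made for this illustration only.

```python
from typing import List, Sequence


def split_into_packages(files: Sequence[str], files_per_package: int) -> List[List[str]]:
    """Group the selected files into consecutive packages.

    A value of 0 or less is treated here as "no limit" (one package for
    everything). That convention is an assumption of this sketch.
    """
    if files_per_package <= 0:
        return [list(files)] if files else []
    return [
        list(files[i:i + files_per_package])
        for i in range(0, len(files), files_per_package)
    ]


selected = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]
print(split_into_packages(selected, 2))
# [['a.jpg', 'b.jpg'], ['c.jpg', 'd.jpg'], ['e.jpg']]
```

With a package size of 1 the same five files would produce five packages, and therefore five separate HTTP POST requests.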
How upload speed depends on the number of "upload packages"
If we look at the HTTP standard and RFC 1867 (Form-based File Upload in HTML), we will notice that every HTTP POST request adds some overhead to the amount of uploaded data. This is because, in addition to the payload (in our case, the files and additional information sent from the client side by Image Uploader), each request has to carry data required by the protocol (e.g. the HTTP request header, transmission control data, service fields, etc.). Thus, the more requests we send, the more "overhead" bytes the uploaded data contains. That is the downside of using multiple "upload packages".
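The trend can be shown with a rough back-of-the-envelope model. The sketch below assumes a fixed overhead per request and ignores everything else (for instance, per-file headers that are sent regardless of packaging). The function name, the file sizes, and the 1,000-byte overhead are all made-up values for illustration, not measurements.

```python
import math
from typing import Sequence


def total_bytes_sent(file_sizes: Sequence[int], files_per_package: int,
                     overhead_per_request: int) -> int:
    """Payload bytes plus a fixed overhead for every HTTP POST request."""
    request_count = math.ceil(len(file_sizes) / files_per_package)
    return sum(file_sizes) + request_count * overhead_per_request


# Hypothetical numbers, chosen only to show the trend.
sizes = [300_000] * 200
print(total_bytes_sent(sizes, 200, 1_000))  # all files in one package
print(total_bytes_sent(sizes, 1, 1_000))    # one file per package
```

In this model the payload stays the same in both cases; only the overhead term grows in proportion to the number of requests.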
What are the pros of using them? First of all, a smaller request size. I am sure every web developer has faced the problem of configuring a web server to receive large amounts of data. It is a challenge because it requires a good understanding of how the server platform works, and sometimes it cannot be done at all because of limitations on the server side or at the hosting provider. Using multiple "upload packages" (that is, multiple HTTP POST requests) allows us to upload less data in each request: no matter how many files are selected for upload, each request carries a predefined number of files.
Another advantage of multiple "upload packages" is parallelism. The upload requests do not depend on each other, so we can process them in parallel. The second part of this post sheds light on parallelism in Image Uploader.
Approaches to multithreaded file upload in Image Uploader
Image Uploader 6
Image Uploader 6 implements two approaches to processing and uploading files:
- Process all "upload packages" in a single thread. There is no parallelism here; all packages are processed one by one.
- Use a pool of threads to process "upload packages" (available in Image Uploader 6 ActiveX). You can specify how many threads are in the pool. Each thread first optimizes all files in its current package and then uploads them. The thread pool works as follows: as soon as a thread becomes vacant, it takes the next "upload package" that no other thread has started processing and begins processing it.
The second approach reduces upload time for some kinds of file sets. Nevertheless, we found that it gives little benefit for sets of files of a similar type, for example, photos taken with the same camera model. This happens because resource-intensive preprocessing operations running in multiple threads compete for the same computer resources, which seriously hurts efficiency. Here is the diagram showing how this multithreaded processing works.
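The thread pool scheme of version 6 can also be sketched in code. This is only a model of the idea, not the actual implementation: optimize and upload are hypothetical stand-ins that do no real image processing or network I/O, and Python's standard ThreadPoolExecutor plays the role of the pool.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def optimize(package: List[str]) -> List[str]:
    # Hypothetical stand-in for resizing or compressing the files of a package.
    return [f"optimized:{name}" for name in package]


def upload(package: List[str]) -> int:
    # Hypothetical stand-in for one HTTP POST request; returns the file count.
    return len(package)


def process_package(package: List[str]) -> int:
    # Each worker first optimizes the whole package and then uploads it.
    return upload(optimize(package))


def run_pool(packages: List[List[str]], thread_count: int) -> List[int]:
    # A vacant worker picks up the next package that has not been started yet.
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        return list(pool.map(process_package, packages))


print(run_pool([["a.jpg", "b.jpg"], ["c.jpg"], ["d.jpg"]], thread_count=2))
```

Note that with similar files all workers tend to be in the optimization step at the same time, which is where the contention for CPU and memory described above comes from.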
Image Uploader 7
The new approach to multithreaded file upload in Image Uploader 7 was designed with the disadvantages of the previous version in mind. Now Image Uploader (both ActiveX and Java) uses two threads for preprocessing and uploading files. The logic has changed slightly: only one thread uploads data from the client side to the server at any given moment. Following this idea, while the first thread is uploading data, the second thread is preparing the next upload package. Here is a detailed diagram of how it works:
This approach loads computer resources (CPU and memory) more uniformly, which leads to better performance of the uploader.
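A minimal sketch of this two-thread scheme, under a few assumptions: one thread is dedicated to preparing packages and the other to uploading them, a queue of size 1 hands packages over (so at most one prepared package waits while another is being uploaded), and the "optimization" and "upload" steps are placeholders. None of the names below come from Image Uploader itself.

```python
import queue
import threading
from typing import List


def run_pipeline(packages: List[List[str]]) -> List[List[str]]:
    # Assumption: at most one prepared package waits for the uploader.
    ready: "queue.Queue" = queue.Queue(maxsize=1)
    uploaded: List[List[str]] = []

    def prepare_all() -> None:
        for package in packages:
            # Placeholder for resizing or compressing the files.
            ready.put([f"optimized:{name}" for name in package])
        ready.put(None)  # sentinel: nothing more to upload

    def upload_all() -> None:
        while True:
            package = ready.get()
            if package is None:
                break
            # Placeholder for the HTTP POST; only this thread ever uploads.
            uploaded.append(package)

    preparer = threading.Thread(target=prepare_all)
    uploader = threading.Thread(target=upload_all)
    preparer.start()
    uploader.start()
    preparer.join()
    uploader.join()
    return uploaded


print(run_pipeline([["a.jpg", "b.jpg"], ["c.jpg"]]))
# [['optimized:a.jpg', 'optimized:b.jpg'], ['optimized:c.jpg']]
```

Because the CPU-heavy preparation and the network-bound upload run side by side, neither resource is hit by several identical operations at once.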
Comparison of upload speed between versions
Here are the results of the upload speed tests run in our test environment. I used 200 8-megapixel photos taken with a Canon EOS 350D; Image Uploader was set up to resize photos to 1000x1000 before upload. I ran 5 tests for each case on the same machine and listed the mean values in the table. As you can see, version 7 gives better results.