Resumable Uploads of Large Files with the Nginx Upload Module and pycurl

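jackxiang 2013-4-17 14:10
Background:
   Compared with handling uploads in PHP, the Nginx upload module may use considerably less memory, and uploads may run 2 to 3 times faster. The module has no configuration for authentication, however, such as allowing an upload only when a certain cookie is present. It is just a module, so this part still has to be hacked in or implemented by the author himself; I have seen others online who want to use the module and are trying to implement this. Because it builds on the HTTP protocol as defined in the RFCs, it also supports resumable uploads. The article below covers this topic, so I am reposting it here: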


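Our project needed to upload very large files, and also had to account for load balancing of upload requests, resumable transfers from the client to the server (upstream), and server scalability. After comparing FTP, a custom socket protocol, and server-side scripts such as PHP for implementing uploads, we chose Nginx Upload Module + pycurl for large-file uploads.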

1. Installing nginx, the nginx upload module, and the nginx upload progress module
mkdir ~/nginx-source
cd ~/nginx-source
wget http://nginx.org/download/nginx-1.2.7.tar.gz
tar zxvf nginx-1.2.7.tar.gz
wget http://www.grid.net.ru/nginx/download/nginx_upload_module-2.2.0.tar.gz
tar zxvf nginx_upload_module-2.2.0.tar.gz
wget -O nginx-upload-progress-module-master.zip https://github.com/masterzen/nginx-upload-progress-module/archive/master.zip
unzip nginx-upload-progress-module-master.zip
cd nginx-1.2.7
./configure --user=daemon --group=daemon --prefix=/usr/local/nginx-1.2.7/ --add-module=../nginx_upload_module-2.2.0 --add-module=../nginx-upload-progress-module-master --with-http_stub_status_module --with-http_ssl_module --with-http_sub_module --with-md5=/usr/lib --with-sha1=/usr/lib --with-http_gzip_static_module
make
make install
2. Installing PHP
mkdir ~/php-source
cd ~/php-source
wget -O php-5.4.13.tar.gz http://www.php.net/get/php-5.4.13.tar.gz/from/cn2.php.net/mirror
tar zxvf php-5.4.13.tar.gz
cd php-5.4.13
./configure --prefix=/usr/local --with-config-file-path=/etc --enable-suhosin --enable-fpm --enable-fastcgi --enable-force-cgi-redirect --disable-rpath --enable-discard-path --with-mysql --with-mysqli --with-sqlite --with-pdo-sqlite --with-iconv-dir=/usr/local --with-freetype-dir --with-jpeg-dir --with-png-dir --with-gd --with-zlib --with-libxml-dir --with-curl --with-curlwrappers --with-openssl --with-mhash --with-xmlrpc --with-mcrypt --with-ldap --with-ldap-sasl --enable-xml --enable-safe-mode --enable-bcmath --enable-shmop --enable-sysvsem --enable-inline-optimization --enable-mbregex --enable-mbstring --enable-gd-native-ttf --enable-ftp --with-bz2 --enable-pcntl --enable-sockets --enable-zip --enable-soap --enable-pdo --disable-debug --disable-ipv6
make
make install
3. nginx configuration
user daemon;
worker_processes 1;
#error_log logs/error.log;
#error_log logs/error.log notice;
#error_log logs/error.log info;
#pid logs/nginx.pid;
events {
    worker_connections 1024;
}
http {
    include mime.types;
    default_type application/octet-stream;
    #log_format main '$remote_addr - $remote_user [$time_local] "$request" '
    # '$status $body_bytes_sent "$http_referer" '
    # '"$http_user_agent" "$http_x_forwarded_for"';
    #access_log logs/access.log main;
    sendfile on;
    #tcp_nopush on;
    #keepalive_timeout 0;
    keepalive_timeout 65;
    #gzip on;
    upstream web {
        server 127.0.0.1:80;
    }
    upstream php {
        server 127.0.0.1:9000 max_fails=0;
    }
    server {
        listen 80;
        server_name localhost;
        #charset koi8-r;
        #access_log logs/host.access.log main;
        client_max_body_size 100m;
        # Upload form should be submitted to this location
        location /upload {
            # Pass altered request body to this location
            root html;
            upload_pass /upload.php;
            # Store files to this directory
            # The directory is hashed, subdirectories 0 1 2 3 4 5 6 7 8 9 should exist
            upload_store /var/uploads 1;
            # Allow uploaded files to be read only by user
            upload_store_access user:r;
            upload_resumable on;
            # Set specified fields in request body
            upload_set_form_field "${upload_field_name}_name" $upload_file_name;
            upload_set_form_field "${upload_field_name}_content_type" $upload_content_type;
            upload_set_form_field "${upload_field_name}_path" $upload_tmp_path;
            # Inform backend about hash and size of a file
            upload_aggregate_form_field "${upload_field_name}_md5" $upload_file_md5;
            upload_aggregate_form_field "${upload_field_name}_size" $upload_file_size;
            upload_pass_form_field "^submit$|^description$";
        }
        #error_page 404 /404.html;
        error_page 405 =200 @405;
        location @405
        {
            root html;
        }
        # redirect server error pages to the static page /50x.html
        #
        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
            root html;
        }
        # pass the PHP scripts to FastCGI server listening on 127.0.0.1:9000
        #
        #location ~ \.php$ {
        location ~ .*\.php(\/.*)*$ {
            root html;
            fastcgi_pass 127.0.0.1:9000;
            fastcgi_index index.php;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            include fastcgi_params;
        }
    }
}
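The upload location above replaces each uploaded file in the request body with a set of form fields (`<field>_name`, `<field>_content_type`, `<field>_path`, `<field>_md5`, `<field>_size`) before passing the request on to the backend. The article's backend is upload.php; the Python sketch below only illustrates how a backend might read those fields and re-check the reported MD5. The function names `collect_upload` and `md5_matches` are hypothetical, and `fields` is assumed to be a plain dict of the POST fields the backend received.

```python
import hashlib


def collect_upload(fields, field_name):
    """Gather the per-file form fields that the upload location adds."""
    suffixes = ("name", "content_type", "path", "md5", "size")
    info = {s: fields["%s_%s" % (field_name, s)] for s in suffixes}
    info["size"] = int(info["size"])  # form values arrive as strings
    return info


def md5_matches(info, block_size=1 << 20):
    """Re-hash the stored file and compare it with the reported MD5."""
    digest = hashlib.md5()
    with open(info["path"], "rb") as fp:
        # Read in blocks so that large files are not loaded into memory.
        for block in iter(lambda: fp.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest() == info["md5"].lower()
```

With the curl test later in this article, the form field is called `media`, so the backend would call `collect_upload(fields, "media")`.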
Create the upload directories
mkdir -p /var/uploads/{0..9}
mkdir -p /var/uploads/{a..z}
chown -R daemon:daemon /var/uploads
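As the comment in the configuration notes, the hashed store expects the subdirectories 0 to 9 to exist already. Where provisioning is scripted in Python rather than shell, the same preparation might look like the sketch below. The function name `ensure_store_dirs` is hypothetical, and changing ownership to the daemon user is left out.

```python
import os


def ensure_store_dirs(root, names="0123456789"):
    """Create one subdirectory per name under root; skip existing ones."""
    created = []
    for name in names:
        path = os.path.join(root, name)
        if not os.path.isdir(path):
            os.makedirs(path)
            created.append(path)
    return created
```

Calling `ensure_store_dirs("/var/uploads")` returns the list of directories it had to create, which is empty on a second run.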
Start the services
/usr/local/nginx-1.2.7/sbin/nginx
/usr/local/sbin/php-fpm
4. Create upload.html and upload.php in the document root directory html
upload.html
Select files to upload


upload.php

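5. Testing
Test an upload through the web page:
http://127.0.0.1/upload.html
Test a file upload with curl:
curl -v -i -XPOST http://127.0.0.1/upload -F "media=@/home/liangchuan/baby4.jpg;type=image/jpeg;filename=baby4.jpg"
Test chunked, resumable upload of a large file with a pycurl script (tested with a 90 MB file, sent in 30 MB chunks):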
# -*- coding: utf-8 -*-
# Test script: send a large file to the upload location in fixed-size chunks.
import os
from io import BytesIO

import pycurl

chunksize = 30000000
filename = '/home/liangchuan/test.rar'


class FileReader:
    """Hand libcurl at most `length` bytes of fp, starting at offset `start`."""

    def __init__(self, fp, start, length):
        self.fp = fp
        self.fp.seek(start)
        self.length = length

    def read_callback(self, size):
        if self.length == 0:  # this chunk has been read completely
            return b''
        size = min(size, self.length)
        self.length -= size
        return self.fp.read(size)


filesize = os.path.getsize(filename)
c = pycurl.Curl()
c.setopt(c.URL, 'http://127.0.0.1/upload')
pf = [('test', (c.FORM_FILE, filename, c.FORM_CONTENTTYPE, 'application/x-rar-compressed'))]
c.setopt(c.HTTPPOST, pf)
c.setopt(c.VERBOSE, 1)
b = BytesIO()  # collects the response bodies
c.setopt(pycurl.WRITEFUNCTION, b.write)
# Round up so that a final, shorter chunk is not skipped.
num = (filesize + chunksize - 1) // chunksize
fp = open(filename, 'rb')
for i in range(num):
    start = i * chunksize
    length = min(chunksize, filesize - start)
    c.setopt(pycurl.INFILESIZE, length)
    c.setopt(pycurl.READFUNCTION, FileReader(fp, start, length).read_callback)
    # Byte ranges are inclusive, so the last byte is start + length - 1.
    c.setopt(pycurl.RANGE, '%d-%d' % (start, start + length - 1))
    c.perform()
    #response_code = c.getinfo(pycurl.RESPONSE_CODE)
    #print(response_code)
print(b.getvalue())
fp.close()
c.close()
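The chunk arithmetic is the part of the script that is easiest to get wrong, so here is a small stand-alone sketch of it. It computes inclusive byte ranges that cover the whole file, including a shorter final chunk, and builds per-segment request headers. The header names (`X-Content-Range`, `Session-ID`, `Content-Disposition`) reflect my understanding of the upload module's resumable-upload notes, where each segment is sent as a raw request body rather than a multipart form. Treat them as assumptions to check against the module's own documentation. The helper names `chunk_ranges` and `segment_headers` are hypothetical.

```python
def chunk_ranges(filesize, chunksize):
    """Yield (start, end) byte offsets, end inclusive, covering the file."""
    start = 0
    while start < filesize:
        end = min(start + chunksize, filesize) - 1
        yield start, end
        start = end + 1


def segment_headers(start, end, filesize, session_id, filename):
    """Build request headers for one raw segment.

    Header names are assumptions, not confirmed by this article.
    """
    return [
        "Content-Type: application/octet-stream",
        'Content-Disposition: attachment; filename="%s"' % filename,
        "X-Content-Range: bytes %d-%d/%d" % (start, end, filesize),
        "Session-ID: %s" % session_id,
    ]
```

A list like the one returned by `segment_headers` can be handed to pycurl with `c.setopt(pycurl.HTTPHEADER, headers)`. Keeping the range calculation in one generator means a resumed upload only has to skip the ranges that are already complete.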
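Notes:
1. The script above is for testing only. For real production use (for example, automatic resumable transfer of large files), the mechanism still needs work. The general idea:
The server-side upload.php stores the upload status of each chunk in Redis, a database, or a similar mechanism, and provides a query interface for the client.
The client uses sqlite to track the transfer status of each chunk. After a failure, it first calls the server's query interface to find out which ranges succeeded, then restarts the upload from the indicated range.
2. The pycurl example above mainly tests automatic uploads running in the background. For file uploads through a web interface, see http://blueimp.github.com/jQuery-File-Upload/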
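The client-side bookkeeping from note 1 could be sketched with Python's built-in sqlite3 module as follows. This is only an illustration under assumptions: the table layout and the function names (`open_tracker`, `register_chunks`, `mark_done`, `pending_chunks`) are hypothetical, and the call to the server's query interface is left out because the article does not define it.

```python
import sqlite3


def open_tracker(path=":memory:"):
    """Open (or create) the chunk-status database."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS chunks ("
        "filename TEXT, first_byte INTEGER, last_byte INTEGER, "
        "done INTEGER DEFAULT 0, PRIMARY KEY (filename, first_byte))")
    return db


def register_chunks(db, filename, ranges):
    """Record the planned chunks; chunks already known keep their status."""
    db.executemany(
        "INSERT OR IGNORE INTO chunks (filename, first_byte, last_byte) "
        "VALUES (?, ?, ?)",
        [(filename, first, last) for first, last in ranges])
    db.commit()


def mark_done(db, filename, first_byte):
    """Mark one chunk as transferred successfully."""
    db.execute(
        "UPDATE chunks SET done = 1 WHERE filename = ? AND first_byte = ?",
        (filename, first_byte))
    db.commit()


def pending_chunks(db, filename):
    """Return the (first_byte, last_byte) ranges still to be uploaded."""
    rows = db.execute(
        "SELECT first_byte, last_byte FROM chunks "
        "WHERE filename = ? AND done = 0 ORDER BY first_byte", (filename,))
    return rows.fetchall()
```

After a failure, the client would mark every range the server reports as complete with `mark_done` and then upload only what `pending_chunks` returns. Passing a file path to `open_tracker` instead of the in-memory default lets the state survive a client restart.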

Source: http://blog.sina.com.cn/s/blog_5921b17e01019oil.html

Author: jackxiang@向东博客
URL: http://jackxiang.com/post/6265/
All rights reserved. Reposts must credit the author and the original source with a link, and must include this notice.
