Monthly archive: March 2010

Google Sitemap Generator: Installation and Usage in Detail

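Yesterday I looked into Google Sitemap Generator, an SEO toolkit released by Google itself.

It is not a web script or a plugin. It is a service that runs on its own once installed.

It works with Apache and its logs to plan the site's Sitemap more effectively, for analysis by Google and other search engines.

Builds are currently available for Windows, Linux, and Linux x64.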

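Home page: http://code.google.com/p/googlesitemapgenerator/

Downloads: http://code.google.com/p/googlesitemapgenerator/downloads/list

The walkthrough below uses RedHat AS 5.4 x64 with Apache 2.2.11 as the example and covers everything from installation to deployment.

1. Download the software directly from Google and run the installer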

# wget http://googlesitemapgenerator.googlecode.com/files/sitemap_linux-x86_64-beta1-20091231.tar.gz
# tar zxvf sitemap_linux-x86_64-beta1-20091231.tar.gz
# cd sitemap-install
# ./install.sh

Press Enter twice, read the license agreement, and press Y to accept it. The installation wizard then starts.

# What is the location of the Apache binary or control script? []
Enter the path to apachectl, which depends on where your Apache is installed, for example /usr/local/apache2/bin/apachectl

After you press Enter, the installer reports what it detected:

The following information about your Apache installation has been detected:
  * Apache version: 2.2
  * Apache architecture: 64 bits
  * Apache root configuration file: /usr/local/apache2/conf/httpd.conf
  * Apache group: apache
***************************************************************************
Is all of this information correct? If you answer No, installation will
terminate and you’ll need to restart the installation, using the necessary
command line options. [N/y] Y

Check the information and press Y to continue.

—————————————-

Google Sitemap Generator will start creating Web Sitemap files as soon as it
starts up. Do you want Google Sitemap Generator to start submitting these
files automatically? There are three options:
1.  First installation. Start with automatic submission disabled.
2.  First installation. Start with automatic submission enabled.
3.  Reinstallation. Use the old automatic submission settings.

Specify your choice [1]:2
Choose the installation mode. I picked option 2: first installation, with automatic submission of the generated Sitemap files enabled.

—————————————

Apache configuration successfully updated.
Old configuration is saved at /etc/google-sitemap-generator/httpd.install.conf

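The installation wizard has modified httpd.conf; what it actually does is append an include directive as the last line.
Before making the change it automatically copies the file to /etc/google-sitemap-generator/httpd.install.conf.
If you uninstall Google Sitemap Generator, the uninstaller copies that file back.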

Ready to set the password for the administration console.
Password (5 or more characters):
Confirm password:
Set a password of at least five characters.

—————————————–

Google Sitemap Generator daemon successfully started.
To start the Google Sitemap Generator module in Apache, you must restart Apache.
After you restart Apache, you can go to http://<this-server-address>:8181/ to
configure the application.
Google Sitemap Generator (Beta) was successfully installed.

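Installation is complete.
The Google Sitemap Generator program is already running in the background.
For it to actually do its work, Apache must be restarted so that mod_sitemap and the configuration changes take effect.
After restarting Apache, open http://localhost:8181 on the server itself and enter the password you set during installation to log in.

For security reasons, Google Sitemap Generator does not allow remote connections by default and responds with "Remote access is denied."
So a little more configuration is needed:

Go to its default installation path, /usr/local/google-sitemap-generator/bin,
and run the following command:
# ./sitemap-daemon remote_admin enable
This enables remote login.
Now open http://<your-server-address>:8181 from your local machine to reach the login page.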

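Once logged in, you will see the names of all virtual hosts configured in Apache.
They should correspond to the host headers and all be different. If they all show as Localhost, or all match the ServerName in httpd.conf,
add a separate ServerName to each virtual host's configuration, for example: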

<VirtualHost *:80>
    ServerAdmin qbanke@163.com
    DocumentRoot /data2/web_server/admin
    ServerName www.gznow.org
    ServerAlias www.gznow.org
</VirtualHost>

If you do not have this problem, carry on.
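To spot virtual hosts that are missing a ServerName before opening the admin console, a small check like the one below may help. It is only a sketch in Python: the function name `vhosts_missing_servername` and the second sample host are made up for illustration, and it only looks at the text it is given (it does not follow Include directives).

```python
import re

def vhosts_missing_servername(conf_text):
    """Return the opening tags of <VirtualHost> blocks that have no ServerName."""
    missing = []
    # Capture each <VirtualHost ...> ... </VirtualHost> block, case-insensitively.
    block_re = re.compile(
        r"(<VirtualHost\b[^>]*>)(.*?)</VirtualHost>",
        re.IGNORECASE | re.DOTALL,
    )
    for opening, body in block_re.findall(conf_text):
        # Look for a ServerName directive at the start of a line (indentation allowed).
        # A commented-out "# ServerName ..." line does not count.
        if not re.search(r"^\s*ServerName\s+\S+", body, re.IGNORECASE | re.MULTILINE):
            missing.append(opening)
    return missing

# Hypothetical sample: the second host has no ServerName.
sample = """
<VirtualHost *:80>
    DocumentRoot /data2/web_server/admin
    ServerName www.gznow.org
</VirtualHost>
<VirtualHost *:80>
    DocumentRoot /data2/web_server/other
</VirtualHost>
"""
print(vhosts_missing_servername(sample))
```

Any block it reports is a candidate for showing up under the wrong name in the console.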

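Click one of the host names to open its configuration.

The first page shows the running status of that host.

Webserver filter collection is enabled by default.
Below it you can also enable the Log parser (log analysis)
and the File scanner.

I enabled Webserver filter and Log parser.
To enable them, click Site configuration on the left and fill in:

Pathname for log file(s): enter the path of this host's httpd log on the server.
Webserver filter  [ √ ]
Log parser        [ √ ]

Tick those two boxes and click Save.

That completes this settings page. On to the next step.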

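Go to Sitemap types on the left
and start with Web.
This one is quite important. Google Sitemap Generator uses the Apache httpd.conf configuration to find each host's document root,
generates the sitemap files in that directory for search engines to fetch, and also generates robots.txt. If robots.txt already exists,
it appends a line pointing to the sitemap file under that domain's root.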

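Now for the settings:

First is schedule, which sets the update frequency.
Anything from one hour to one day is up to you; how it affects crawling still needs investigation.

Sitemap file settings sets the Sitemap file name, which you can change freely, plus the number of entries and the file size.
A smaller file is more likely to be fetched by Google successfully.
More entries help Google index more pages in a short time. (To check the indexed count, open g.cn and search for site:www.xxx.com)
Weigh the trade-off yourself.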

Sitemap file submission makes it add the path of the Sitemap file to robots.txt.
After the change, mine had one extra line: http://www.gznow.org/sitemap_google.xml.gz # Added by Google Sitemap Generator
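For anyone who wants to check or maintain that robots.txt entry by hand, here is a rough Python sketch. It assumes the standard `Sitemap: <url>` form of the directive (the exact line the tool writes may differ), and the function name `ensure_sitemap_line` is my own.

```python
def ensure_sitemap_line(robots_text, sitemap_url):
    """Return robots.txt content that contains a Sitemap line for sitemap_url."""
    for line in robots_text.splitlines():
        # Ignore trailing comments; the directive name is matched case-insensitively.
        directive = line.split("#", 1)[0].strip()
        name, sep, value = directive.partition(":")
        if sep and name.strip().lower() == "sitemap" and value.strip() == sitemap_url:
            return robots_text  # already present, nothing to do
    if robots_text and not robots_text.endswith("\n"):
        robots_text += "\n"
    return robots_text + "Sitemap: " + sitemap_url + "\n"

existing = "User-agent: *\nDisallow:"
updated = ensure_sitemap_line(existing, "http://www.gznow.org/sitemap_google.xml.gz")
print(updated)
```

Running it a second time on its own output leaves the text unchanged, so it is safe to call repeatedly.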

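Click Save and the settings for Web content are done.

If you have signed up for Google Webmaster Tools, you can add the corresponding sitemap file there.
You do not have to, of course: Google and other search engine spiders will find it on their own through robots.txt.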

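The entries that follow, Mobile, Code Search, and Blog Search, are indexing settings for different kinds of sites.
Which ones you need depends on your site's content. For a mobile WAP site, configure Mobile as well; for a site that publishes source code, Code Search; the remaining one is for blogs.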

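That concludes the installation and configuration of Google Sitemap Generator.

It analyzes Apache's activity in the background on the server and generates a more effective sitemap for Google, Yahoo, Ask, and Live to fetch.
In theory this should improve your site's SEO.

That said, I have only been using it for a few dozen hours and do not know how effective it is. Everyone is welcome to share their results after trying it.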

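Oh, one thing I forgot to mention:
if the httpd.conf configuration changes, for example when virtual hosts are added or removed, Apache has to be restarted,
and Google Sitemap Generator also needs to be restarted.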

Apache 2 with SSL/TLS: Step-by-Step, Part 1

For more than 10 years the SSL protocol has been widely used for the purpose of securing web transactions over the Internet. One can only guess how many millions or billions of dollars in transactions are processed per day using SSL. Unfortunately, the simple fact that we use SSL does not necessarily mean that the information sent over this protocol is secure. The use of weak encryption, the impossibility of verifying web servers’ certificates, security vulnerabilities in web servers or the SSL libraries, as well as other attacks, may each let intruders access sensitive information, regardless of the fact that it is being sent through SSL.

This article begins a series of three articles dedicated to configuring Apache 2.0 with SSL/TLS support in order to ensure maximum security and optimal performance of the SSL communication. This article, part one, introduces key aspects of SSL/TLS and then shows how to install and configure Apache 2.0 with support for these protocols. The second part discusses the configuration of mod_ssl, and then addresses issues with web server authentication. The second article also shows how to create a web server’s SSL certificate. The third and final article in this series discusses client authentication and some typical configuration mistakes made by administrators that may decrease the security level of any SSL communication.

Introduction to SSL

Secure Sockets Layer (SSL) is the most widely known protocol that offers privacy and good reliability for client-server communication over the Internet. SSL itself is conceptually quite simple: it negotiates the cryptography algorithms and keys between two sides of a communication, and establishes an encrypted tunnel through which other protocols (like HTTP) can be transported. Optionally, SSL can also authenticate both sides of communication through the use of certificates.

SSL is a layered protocol and consists of four sub-protocols:

SSL Handshake Protocol
SSL Change Cipher Spec Protocol
SSL Alert Protocol
SSL Record Layer

The position of these protocols in the TCP/IP model is illustrated in Figure 1.

Figure 1. SSL sub-protocols in the TCP/IP model

As the diagram above shows, SSL is found in the application layer of the TCP/IP model. Because of this, SSL can be implemented on almost every operating system that supports TCP/IP, without the need to modify the system kernel or the TCP/IP stack. This gives SSL a very strong advantage over other protocols like IPSec (IP Security Protocol), which requires kernel support and a modified TCP/IP stack. SSL can also be easily passed through firewalls and proxies, as well as through NAT (Network Address Translation) without issues.

How does SSL work? The diagram below, Figure 2, shows the simplified, step-by-step process of establishing each new SSL connection between the client (usually a web browser) and the server (usually an SSL web server).

Figure 2. How SSL establishes connections, step-by-step.

As you can see from Figure 2, the process of establishing each new SSL connection starts with exchanging encryption parameters and then optionally authenticating the server (using the SSL Handshake Protocol). If the handshake is successful and both sides agree on a common cipher suite and encryption keys, the application data (usually HTTP, but it can be another protocol) can be sent through the encrypted tunnel (using the SSL Record Layer).

In reality, the above process is a little more complicated. To avoid unnecessary handshakes, some of the encryption parameters are cached. Alert messages may be sent. Cipher suites can be changed as well. However, regardless of the SSL specification details, the most common way this process actually works is very similar to the above.

SSL, PCT, TLS and WTLS (but not SSH)

Although SSL is the best known and the most popular, it is not the only protocol that has been used for the purpose of securing web transactions. It is important to know that since the invention of SSL v1.0 (which has never been released, by the way) there have been at least five protocols that have played a more-or-less important role in securing access to the World Wide Web, as we see below:

SSL v2.0
Released by Netscape Communications in 1994. The main goal of this protocol was to provide security for transactions over the World Wide Web. Unfortunately, very quickly a number of security weaknesses were found in this initial version of the SSL protocol, thus making it less reliable for commercial use:
weak MAC construction
possibility of forcing parties to use weaker encryption
no protection for handshakes
possibility of an attacker performing truncation attacks

PCT v1.0
Developed in 1995 by Microsoft. Private Communications Technology (PCT) v1.0 addressed some weaknesses of SSL v2.0, and was intended to replace SSL. However, this protocol has never gained as much popularity as SSL v3.0.

SSL v3.0
Released in 1996 by Netscape Communications. SSL v3.0 solved most of the SSL v2.0 problems, and incorporated many of the features of PCT. It quickly became the most popular protocol for securing communication over the WWW.

TLS v1.0 (also known as SSL v3.1)
Published by the IETF in 1999 (RFC 2246). This protocol is based on SSL v3.0 and PCT and harmonizes both Netscape’s and Microsoft’s approaches. It is important to note that although TLS is based on SSL, it is not 100% backward compatible with its predecessor. The IETF made some security improvements, such as using HMAC instead of MAC, using a different calculation of the master secret and key material, adding additional alert codes, dropping support for Fortezza cipher suites, and so on. The end result of these improvements is that these protocols don’t fully interoperate. Fortunately, TLS also has a mode to fall back to SSL v3.0.

WTLS
“Mobile and wireless” version of the TLS protocol that uses the UDP protocol as a carrier. It is designed and optimized for the lower bandwidth and smaller processing capabilities of WAP-enabled mobile devices. WTLS was introduced with the WAP 1.1 protocol, and was released by the WAP Forum. However, after the introduction of the WAP 2.0 protocol, WTLS has been replaced by a profiled version of the TLS protocol, which is much more secure — mainly because there is no need for decryption and re-encryption of the traffic at the WAP gateway.

Why has the SSH (Secure Shell) protocol not been used for the purpose of providing secure access to the World Wide Web? There are a few reasons. First of all, from the very beginning TLS and SSL were designed for securing web (HTTP) sessions, whereas SSH was intended to replace Telnet and FTP. SSL does nothing more than perform the handshake and establish the encrypted tunnel, while SSH offers console login, secure file transfer, and support for multiple authentication schemes (including passwords, public keys, Kerberos, and more). On the other hand, SSL/TLS is based on X.509v3 certificates and PKI, which makes the distribution and management of authentication credentials much easier to perform. Hence, these and other reasons make SSL/TLS more suitable for securing WWW access and similar forms of communication, including SMTP, LDAP and others, whereas SSH is more convenient for remote system management.

To summarize, although several “secure” protocols do indeed exist, only two of them should be used for the purpose of securing web transactions (at least at the moment): TLS v1.0 and SSL v3.0. Both of them are referred to in the rest of this article series simply as SSL/TLS. Because of known weaknesses of SSL v2.0, and the famous “WAP gap” in the case of WTLS, the use of these other protocols should be avoided or at least minimized.

Software requirements

This next part of the article shows how to configure Apache 2.0 with SSL/TLS support, using the mod_ssl module. Therefore, before going further, readers are encouraged to download the latest version of the Apache 2.0 source code from Apache’s web site. Most of the examples should also work for Apache 1.3.x. In that case, however, mod_ssl needs to be downloaded separately from Apache’s source code, from the mod_ssl website.

The practical examples presented in the article should work on most Linux, Linux-like and BSD-based operating systems. The only requirement for the operating system is to have both GCC and the OpenSSL library installed.

MS Internet Explorer has been chosen as the default web browser for our testing, mainly because of the ubiquity of that browser. However, any modern web browser can be used, including Firefox, Mozilla, Netscape, Safari, Opera and others.

Installing Apache with SSL/TLS support

The first step in installing Apache with SSL/TLS support is to configure and install the Apache 2 web server, and create a user and group named “apache”. A secure way of installing Apache 2.0 has already been published on SecurityFocus in the article Securing Apache 2.0: Step-by-Step. The only difference from that process is to enable mod_ssl and mod_setenvif, the latter being required for compatibility with some versions of MS Internet Explorer, as follows (the changed options are --enable-ssl and --enable-setenvif):
./configure \
  --prefix=/usr/local/apache2 \
  --with-mpm=prefork \
  --enable-ssl \
  --disable-charset-lite \
  --disable-include \
  --disable-env \
  --enable-setenvif \
  --disable-status \
  --disable-autoindex \
  --disable-asis \
  --disable-cgi \
  --disable-negotiation \
  --disable-imap \
  --disable-actions \
  --disable-userdir \
  --disable-alias \
  --disable-so

After configuring, we can install Apache into the destination directory:
make
su
umask 022
make install
chown -R root:sys /usr/local/apache2

Configuring SSL/TLS

Before running Apache for a first time, we need also to provide an initial configuration and prepare some sample web content. As a minimum, we need to go through the following steps (as root):

Create some sample web content, which will be served up via TLS/SSL:
umask 022
mkdir /www
echo "<html><head><title>Test</title></head><body> \
  Test works.</body></html>" > /www/index.html
chown -R root:sys /www

Replace the default Apache configuration file (normally found in /usr/local/apache2/conf/httpd.conf) with the new one, using the following content (optimized with respect to security and performance).
# =================================================
# Basic settings
# =================================================
User apache
Group apache
ServerAdmin webmaster@www.seccure.lab
ServerName www.seccure.lab
UseCanonicalName Off
ServerSignature Off
HostnameLookups Off
ServerTokens Prod
ServerRoot "/usr/local/apache2"
DocumentRoot "/www"
PidFile /usr/local/apache2/logs/httpd.pid
ScoreBoardFile /usr/local/apache2/logs/httpd.scoreboard
<IfModule mod_dir.c>
    DirectoryIndex index.html
</IfModule>
# =================================================
# HTTP and performance settings
# =================================================
Timeout 300
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 30
<IfModule prefork.c>
    MinSpareServers 5
    MaxSpareServers 10
    StartServers 5
    MaxClients 150
    MaxRequestsPerChild 0
</IfModule>
# =================================================
# Access control
# =================================================
<Directory />
    Options None
    AllowOverride None
    Order deny,allow
    Deny from all
</Directory>
<Directory "/www">
    Order allow,deny
    Allow from all
</Directory>
# =================================================
# MIME encoding
# =================================================
<IfModule mod_mime.c>
    TypesConfig /usr/local/apache2/conf/mime.types
</IfModule>
DefaultType text/plain
<IfModule mod_mime.c>
    AddEncoding x-compress              .Z
    AddEncoding x-gzip                  .gz .tgz
    AddType application/x-compress      .Z
    AddType application/x-gzip          .gz .tgz
    AddType application/x-tar           .tgz
    AddType application/x-x509-ca-cert  .crt
    AddType application/x-pkcs7-crl     .crl
</IfModule>
# =================================================
# Logs
# =================================================
LogLevel warn
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent
ErrorLog /usr/local/apache2/logs/error_log
CustomLog /usr/local/apache2/logs/access_log combined
CustomLog logs/ssl_request_log \
  "%t %h %{HTTPS}x %{SSL_PROTOCOL}x %{SSL_CIPHER}x \
  %{SSL_CIPHER_USEKEYSIZE}x %{SSL_CLIENT_VERIFY}x \"%r\" %b"
# =================================================
# SSL/TLS settings
# =================================================
Listen 0.0.0.0:443
SSLEngine on
SSLOptions +StrictRequire
<Directory />
    SSLRequireSSL
</Directory>
SSLProtocol -all +TLSv1 +SSLv3
SSLCipherSuite HIGH:MEDIUM:!aNULL:+SHA1:+MD5:+HIGH:+MEDIUM
SSLMutex file:/usr/local/apache2/logs/ssl_mutex
SSLRandomSeed startup file:/dev/urandom 1024
SSLRandomSeed connect file:/dev/urandom 1024
SSLSessionCache shm:/usr/local/apache2/logs/ssl_cache_shm
SSLSessionCacheTimeout 600
SSLPassPhraseDialog builtin
SSLCertificateFile /usr/local/apache2/conf/ssl.crt/server.crt
SSLCertificateKeyFile /usr/local/apache2/conf/ssl.key/server.key
SSLVerifyClient none
SSLProxyEngine off
<IfModule mime.c>
    AddType application/x-x509-ca-cert      .crt
    AddType application/x-pkcs7-crl         .crl
</IfModule>
SetEnvIf User-Agent ".*MSIE.*" \
    nokeepalive ssl-unclean-shutdown \
    downgrade-1.0 force-response-1.0

Note: Readers should change some of the values in the above configuration file such as the name of the web server, the administrator’s e-mail address, etc.

Prepare the directory structure for web server’s private keys, certificates and certification revocation lists (CRLs):
umask 022
mkdir /usr/local/apache2/conf/ssl.key
mkdir /usr/local/apache2/conf/ssl.crt
mkdir /usr/local/apache2/conf/ssl.crl

Create a self-signed server certificate (it should be used only for test purposes — your real certificate should come from a valid CA such as Verisign):
openssl req \
  -new \
  -x509 \
  -days 30 \
  -keyout /usr/local/apache2/conf/ssl.key/server.key \
  -out /usr/local/apache2/conf/ssl.crt/server.crt \
  -subj '/CN=Test-Only Certificate'

Testing the installation

At this point we can start Apache with SSL/TLS support, as follows:
/usr/local/apache2/bin/apachectl startssl
  Apache/2.0.52 mod_ssl/2.0.52 (Pass Phrase Dialog)
  Some of your private key files are encrypted for security reasons.
  In order to read them you have to provide us with the pass phrases.
  Server 127.0.0.1:443 (RSA)
  Enter pass phrase:*************
  Ok: Pass Phrase Dialog successful.

After the server starts, we can try to connect to it by pointing the web browser to the URL of the form: https://name.of.the.web.server (in our case, https://www.seccure.lab)

In a few moments, we should see a warning message saying that there is a problem with verifying the authenticity of the web server we want to access. Figure 3 below shows an example from MS Internet Explorer 6.0.

Figure 3. Anticipated IE 6 certificate warning.

This warning is expected. We receive it for two reasons:
The web browser does not know the Certificate Authority which issued the web server’s certificate (and cannot know, because we are using a self-signed certificate)
The CN (Common Name) attribute of the certificate does not match the name of the website. At the moment it is “Test-Only Certificate”, and it should be the fully qualified domain name of the web server (e.g. www.seccure.lab)
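The second reason, the name mismatch, can be illustrated with a deliberately simplified Python sketch. The helper `cn_matches` is hypothetical and only compares a Common Name (optionally with a leading wildcard label) against the requested host name; real browsers apply more rules than this, so treat it as an illustration rather than a validator.

```python
def cn_matches(common_name, hostname):
    """Simplified check: does a certificate CN cover the requested host name?"""
    cn = common_name.lower().rstrip(".")
    host = hostname.lower().rstrip(".")
    if cn.startswith("*."):
        # A leading wildcard stands for exactly one left-most label.
        first_label, _, rest = host.partition(".")
        return bool(first_label) and rest == cn[2:]
    return cn == host

print(cn_matches("Test-Only Certificate", "www.seccure.lab"))  # False
print(cn_matches("www.seccure.lab", "www.seccure.lab"))        # True
```

With the test certificate created earlier, the first call is the situation the browser is warning about.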

After proceeding with Internet Explorer, we should see the following web content as shown below in Figure 4.

Figure 4. Sample working SSL web page.

As one may notice, there is a yellow lock at the bottom of the web browser, which means that the SSL connection has been successfully established. The value “128-bit” says that the symmetric key that is being used to encrypt the communication has a length of 128 bits, which is strong enough (at least for the moment) to protect the traffic from unauthorized access.

If we double click the lock icon, we will see the properties of website’s certificate, as shown below in Figure 5.

Figure 5. Details of our self-signed certificate.

Troubleshooting

If for some reason we cannot access the website, there is a very useful diagnostic tool known as “s_client” that comes with the OpenSSL library. It can be used to troubleshoot TLS/SSL connections. An example of how to use this tool is shown below:
/usr/bin/openssl s_client -connect localhost:443
  CONNECTED(00000003)
  depth=0 /CN=Test-Only Certificate
  verify error:num=18:self signed certificate
  verify return:1
  depth=0 /CN=Test-Only Certificate
  verify return:1
  ---
  Certificate chain
   0 s:/CN=Test-Only Certificate
     i:/CN=Test-Only Certificate
  ---
  Server certificate
  -----BEGIN CERTIFICATE-----
  MIICLzCCAZigAwIBAgIBADANBgkqhkiG9w0BAQQFADAgMR4wHAYDVQQDExVUZXN0
  LU9ubHkgQ2VydGlmaWNhdGUwHhcNMDQxMTIyMTg0ODUxWhcNMDQxMjIyMTg0ODUx
  WjAgMR4wHAYDVQQDExVUZXN0LU9ubHkgQ2VydGlmaWNhdGUwgZ8wDQYJKoZIhvcN
  AQEBBQADgY0AMIGJAoGBAMEttnihJ7JpksdToPi5ZVGcssUbHn/G+4G43OiLhP0i
  KvYuqNxBkSqqM1AanR0BFVEtVCSuq8KS9LLRdQLJ/B1UTMOGz1Pb14WGsVJS+38D
  LdLEFaCyfkjNKnUgeKMyzsdhZ52pF9febB+d8cLmvXFve28sTIxLCUK7l4rjT3Xl
  AgMBAAGjeTB3MB0GA1UdDgQWBBQ50isUEV6uFPZ0L4RbRm41+i1CpTBIBgNVHSME
  QTA/gBQ50isUEV6uFPZ0L4RbRm41+i1CpaEkpCIwIDEeMBwGA1UEAxMVVGVzdC1P
  bmx5IENlcnRpZmljYXRlggEAMAwGA1UdEwQFMAMBAf8wDQYJKoZIhvcNAQEEBQAD
  gYEAThyofbK3hg8AJXbAUD6w6+mz6dwsBmcTWLvYtLQUh86B0zWnVxzSLDmwgdUB
  NxfJ7yfo0PkqNnjHfvnb5W07GcfGgLx5/U3iUROObYlwKlr6tQzMoysNQ/YtN3pp
  52sGsqaOOWpYlAGOaM8j57Nv/eXogQnDRT0txXqoVEbunmM=
  -----END CERTIFICATE-----
  subject=/CN=Test-Only Certificate
  issuer=/CN=Test-Only Certificate
  ---
  No client certificate CA names sent
  ---
  SSL handshake has read 1143 bytes and written 362 bytes
  ---
  New, TLSv1/SSLv3, Cipher is DHE-RSA-AES256-SHA
  Server public key is 1024 bit
  SSL-Session:
      Protocol  : SSLv3
      Cipher    : DHE-RSA-AES256-SHA
      Session-ID: 56EA68A5750511917CC42A1B134A8F218C27C9C0241C35C53977A2A8BBB9986A
      Session-ID-ctx:
      Master-Key: 303B60D625B020280F5F346AB00F8A61A7C4BEA707DFA0ED8D2F52371F8C4F087FB6EFFC02CE3B48F912D2C8929DB5BE
      Key-Arg   : None
      Start Time: 1101164382
      Timeout   : 300 (sec)
      Verify return code: 18 (self signed certificate)
  ---
  GET / HTTP/1.0

  HTTP/1.1 200 OK
  Date: Mon, 22 Nov 2004 22:59:56 GMT
  Server: Apache
  Last-Modified: Mon, 22 Nov 2004 17:24:56 GMT
  ETag: "5c911-46-229c0a00"
  Accept-Ranges: bytes
  Content-Length: 70
  Connection: close
  Content-Type: text/html

  <html><head><title>Test</title></head><body>Test works.</body></html>
  closed

The s_client tool has many useful options, such as switching on/off a particular protocol (-ssl2, -ssl3, -tls1), choosing a certain cipher suite (-cipher), enabling debug mode (-debug), watching SSL/TLS states and messages (-state, -msg), and some other options which can help us find the source of the problems.

If s_client does not lead us to the source of problem, we should change LogLevel value (in httpd.conf) to “debug”, then restart Apache and check its log files (/usr/local/apache2/logs/) for more information.

We can also try to use Ethereal or ssldump. Thanks to these tools, we can passively watch the SSL Handshake messages, and try to find the reason for the failure. A screenshot of doing this using Ethereal is shown below in Figure 6.

Figure 6. Ethereal watching SSL Handshake methods.

Concluding part one

With our secure Apache 2 server up and running with SSL and a sample certificate, this concludes part one of the article series. Next in part two, the reader will see the recommended security and performance settings for mod_ssl, as well as the process for creating a valid web server certificate.

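PHP optimization


I have recently been teaching myself PHP while writing PHP programs at work. The company demands high runtime efficiency, and since I am a beginner it matters to pay attention to efficiency from the very start. Drawing on some material found online, here is a summary of strategies for optimizing PHP programs:

1. Where file_get_contents can replace the file, fopen, feof, fgets family of functions, prefer file_get_contents, because it is much more efficient. Be aware, though, of PHP version issues when file_get_contents opens a URL. (kimi does not agree on this point; for details see http://www.ccvita.com/index.php/163.html)

2. Do as few file operations as possible, even though PHP's file operations are not slow;

3. Optimize SELECT SQL statements, and perform as few INSERT and UPDATE operations as possible (I have been harshly criticized over UPDATE);

4. Use PHP's built-in functions wherever possible (though I once wasted time hunting for a function that does not exist in PHP, time in which I could have written my own; a matter of experience!);

5. Do not declare variables inside loops, especially large ones such as objects (this probably does not apply only to PHP);

6. Avoid assigning to multidimensional arrays inside nested loops where possible;

7. Where PHP's built-in string functions can do the job, do not use regular expressions;

8. foreach is more efficient; prefer foreach over while and for loops;

9. Use single quotes instead of double quotes for string literals;

10. "Use i+=1 instead of i=i+1. It matches C/C++ habits and is more efficient";

11. Variables declared global should be unset() as soon as you are done with them.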

An FTP class written in PHP

class FtpGet {
    private $config;
    private $retry = 15;
    private $errors = array();
    
    function __construct($config = array()) {
        if (empty ( $config )) {
            $this->config = array (
                 'path_local' => '/local/path',
                 'path_remote' => '/remote/path',
                 'server' => 'ftp.server.com',
                 'username' => 'username',
                 'password' => 'password'
            );
        } else {
            $this->config = $config;
        }
    }
    
    public function getFile($filename) {
        if (empty ( $filename ))
            return "";
        $this->errors = array();
        $return_file = "";
        $i = 0;
        while ( $i < $this->retry ) {
            try {
                //try to connect
                $conn = ftp_connect ( $this->config ['server'], 21 );
                if (! $conn) {
                    $this->errors['connect'] = "can not connect to the ftp server " . $this->config ['server'];
                    sleep ( 30 );
                    $i ++;
                    continue;
                }
                
                //try to login
                $lr = ftp_login ( $conn, $this->config ['username'], $this->config ['password'] );
                if (! $lr) {
                    $this->errors['login'] =  "can not login to the ftp server " . $this->config ['server'];
                    //release the connection before retrying
                    ftp_close ( $conn );
                    sleep ( 30 );
                    $i ++;
                    continue;
                }
                
                //check local path    
                if (! $this->checkPath ( $this->config ['path_local'] )) {
                    $this->errors['path'] =  "can not create local path {$this->config['path_local']}, please chmod the dir to 777.";
                    break;
                }
                
                //set to pasv mode
                if(!ftp_pasv ( $conn, true )){
                    $this->errors['pasv'] =  "can not set transfer mode to pasv.";
                    break;
                }

                //get remote files list
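                $fileList = ftp_nlist ( $conn, $this->config ['path_remote'] );
                //ftp_nlist returns false on failure, so check that before counting
                if ($fileList === false || count ( $fileList ) === 0) {
                    $this->errors['nofile'] =  "there are no files on the ftp server " . $this->config ['server'];
                    //release the connection before waiting and retrying
                    ftp_close ( $conn );
                    sleep ( 600 );
                    $i ++;
                    continue;
                }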
                
                //loop to check if matched the file
                foreach ( $fileList as $single_file ) {
                    //match entries whose name ends with $filename (escaped for use in the pattern)
                    $pattern = "/^(.*?)" . preg_quote ( $filename, "/" ) . "\$/i";
                    if ((preg_match ( $pattern, $single_file )) && ftp_size ( $conn, $single_file ) > 0) {
                        $local_file = $this->config ['path_local'] . "/" . basename ( $filename );
                        $rs = ftp_get ( $conn, $local_file, $single_file, FTP_BINARY );
                        if ($rs) {
                            $return_file = $single_file;
                        } else {
                            $this->errors['nofile'] = "can not get the file " . $single_file;
                        }
                        break;
                    }
                }
                
                //close ftp connection; record an error only if closing fails
                if (! ftp_close ( $conn )) {
                    $this->errors['close'] = "can not close the ftp connection.";
                }
            
            } catch ( Exception $e ) {
                $this->errors['exception'] = "found error: {$e->getMessage()}\n";
                sleep(5);
            }
            if (! empty ( $return_file )) {
                break;
            }
            
            $i ++;
        }
        
        if (empty($return_file) && count($this->errors) > 0) {
            $title = "[Error] Could not get file {$filename} from ftp server {$this->config['server']}";
            $message = implode("\n", $this->errors);
            $this->sendMail ( $title, $message );
        }
        return $return_file;
    }
    private function checkPath($Path) {
        if (! is_dir ( $Path ) || ! file_exists ( $Path )) {
            if (mkdir ( $Path, 0777 )) {
                return 1;
            } else {
                return 0;
            }
        } else {
            return 1;
        }
    }
   
    private function sendMail($title, $msg) {
        $boundary = uniqid ();
        $mail_header = "From: abc@from.com\n";
        $mail_header .= "Content-type: multipart/related; boundary=\"{$boundary}\"\n";
       
        //each MIME part starts with "--" followed by the boundary
        $mail_body = "\n--{$boundary}\n";
        $mail_body .= "Content-type: text/plain; charset=\"iso-8859-1\" \n";
        $mail_body .= "Content-transfer-encoding: 7bit \n\n";
        $mail_body .= "This is an automatic mail from the process that downloads files from the ftp server.\n";
        $mail_body .= "{$msg}\n\n";
       
        //the closing boundary has "--" on both sides
        $mail_body .= "--{$boundary}--\n";
       
        $mail_to = "haha@email.com";
        if (mail ( $mail_to, $title, $mail_body, $mail_header )) {
            echo "Send alert email successfully ! \n";
        }
    }
}
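The core of getFile is a retry loop: try, record the error, sleep, try again, up to a limit. That pattern can be pulled out on its own. The following is a sketch in Python rather than PHP, under my own assumptions: the helper name `retry`, its parameters, and the `flaky` stand-in operation are invented for illustration and are not part of the class above.

```python
import time

def retry(operation, attempts=15, delay=30.0, sleep=time.sleep):
    """Call operation() until it returns a truthy value or attempts run out.

    Returns (result, errors); result is None if every attempt failed.
    """
    errors = []
    for attempt in range(1, attempts + 1):
        try:
            result = operation()
            if result:
                return result, errors
            errors.append("attempt %d: no result" % attempt)
        except Exception as exc:  # record the failure and try again
            errors.append("attempt %d: %s" % (attempt, exc))
        if attempt < attempts:
            sleep(delay)  # wait before the next attempt
    return None, errors

# Hypothetical operation that fails twice and then succeeds.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("can not connect")
    return "report.csv"

print(retry(flaky, attempts=5, delay=0))
```

Keeping the collected errors alongside the result makes it easy to send one alert mail at the end, as the class does.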

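Setting MIME types for Office 2007 formats such as docx and xlsx

The MIME material I found earlier only listed types for Office 2002-era formats such as doc and xls.

For example, setting the MIME type of .doc to application/msword is enough. A docx file works fine inside a web page, but once downloaded it turns into a doc file. It still opens without problems, but it bothered me a little. Today I searched again and found the following in another article: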

.dotx,application/vnd.openxmlformats-officedocument.wordprocessingml.template
.docx,application/vnd.openxmlformats-officedocument.wordprocessingml.document
.xlsx,application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
.pptx,application/vnd.openxmlformats-officedocument.presentationml.presentation

.doc,application/msword

.dot,application/msword

.xls,application/vnd.ms-excel
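As a quick way to try the table out, the same extension-to-type pairs can be registered with Python's standard `mimetypes` module. This is only an illustrative sketch; the dictionary name `OFFICE_TYPES` and the file name `report.docx` are my own, and a web server would need the equivalent entries in its own configuration.

```python
import mimetypes

# Extension-to-MIME-type pairs taken from the list above.
OFFICE_TYPES = {
    ".dotx": "application/vnd.openxmlformats-officedocument.wordprocessingml.template",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    ".doc": "application/msword",
    ".dot": "application/msword",
    ".xls": "application/vnd.ms-excel",
}

for ext, mime in OFFICE_TYPES.items():
    # Register (or override) the mapping in the process-wide table.
    mimetypes.add_type(mime, ext)

# guess_type returns (type, encoding); only the type is of interest here.
print(mimetypes.guess_type("report.docx")[0])
```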

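[Repost] PHP functions for getting the size of a remote file

The function for a file's size is filesize()
The function for checking whether a file exists is file_exists();
But both of these only work on local files.

So how do you find out whether a remote file exists, and how large it is?

I searched around, and someone actually suggested downloading the remote file and then checking its size. What kind of logic is that?

Fortunately most people are more sensible: the usual approach is to look at the information returned in the response headers.

How do you get header information in PHP? PHP has plenty of functions, and this case is no exception.

1. The simplest way to get the remote file size

$a_array = get_headers($url, true);
$url is the address. As for the second parameter, passing true makes get_headers return the headers as an array keyed by header name,

so you get an array like the one below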

Array
(
[0] => HTTP/1.1 200 OK
[Date] => Sat, 29 May 2004 12:28:14 GMT
[Server] => Apache/1.3.27 (Unix) (Red-Hat/Linux)
[Last-Modified] => Wed, 08 Jan 2003 23:11:55 GMT
[ETag] => "3f80f-1b6-3e1cb03b"
[Accept-Ranges] => bytes
[Content-Length] => 438
[Connection] => close
[Content-Type] => text/html
)

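So you can get the size of the remote file quite comfortably:
$file_sizeofurl = $a_array['Content-Length'];

2. Getting the remote file size with curl
What if the server disables get_headers?
Switch to another approach and use curl.
I always think of curl as a virtual user that can imitate anything.

Here is a function written by a foreign developer.
Note that the statement
echo '
head-->'.$head.'<----end
';

is something I added, to see what the header actually contains.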
function remote_filesize($uri,$user='',$pw='')
{
// start output buffering
ob_start();
// initialize curl with given uri
$ch = curl_init($uri);
// make sure we get the header
curl_setopt($ch, CURLOPT_HEADER, 1);
// make it a http HEAD request
curl_setopt($ch, CURLOPT_NOBODY, 1);
// if auth is needed, do it here
if (!empty($user) && !empty($pw))
{
$headers = array('Authorization: Basic ' . base64_encode($user.':'.$pw));
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
}
$okay = curl_exec($ch);
curl_close($ch);
// get the output buffer
$head = ob_get_contents();
// clean the output buffer and return to previous
// buffer settings
ob_end_clean();

echo '
head-->'.$head.'<----end
';

// gets you the numeric value from the Content-Length
// field in the http header
$regex = '/Content-Length:\s([0-9].+?)\s/';
$count = preg_match($regex, $head, $matches);

// if there was a Content-Length field, its value
// will now be in $matches[1]
if (isset($matches[1]))
{
$size = $matches[1];
}
else
{
$size = 'unknown';
}
//$last=round($size/(1024*1024),3);
//return $last.' MB';
return $size;
}
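The step that matters in this function is pulling the number out of the raw header text. Here is that step on its own as a small Python sketch; the function name `content_length` is mine, and the pattern is a slightly stricter variant (digits only, case-insensitive) rather than a copy of the one above.

```python
import re

def content_length(raw_headers):
    """Return the Content-Length value from raw HTTP headers as an int, or None."""
    match = re.search(
        r"^Content-Length:[ \t]*(\d+)[ \t]*\r?$",
        raw_headers,
        re.IGNORECASE | re.MULTILINE,
    )
    return int(match.group(1)) if match else None

# Header text shaped like the get_headers example earlier in this post.
head = "HTTP/1.1 200 OK\r\nContent-Length: 438\r\nConnection: close\r\n\r\n"
print(content_length(head))  # 438
```

Returning None instead of the string 'unknown' lets the caller tell a missing header apart from a real size.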

3. Getting the remote file size with fsockopen
The function first:
function getFileSize($url){
    $url = parse_url($url);
    if($fp = @fsockopen($url['host'],empty($url['port'])?80:$url['port'],$error)){
        // every request line ends with CRLF; an empty line ends the headers
        fputs($fp,"GET ".(empty($url['path'])?'/':$url['path'])." HTTP/1.1\r\n");
        fputs($fp,"Host: $url[host]\r\n\r\n");
        while(!feof($fp)){
            $tmp = fgets($fp);
            if(trim($tmp) == ''){
                // blank line: end of the response headers
                break;
            }else if(preg_match('/Content-Length:(.*)/si',$tmp,$arr)){
                fclose($fp);
                return trim($arr[1]);
            }
        }
        fclose($fp);
        return null;
    }else{
        return null;
    }
}
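Since only the headers are needed, the request sent over the socket could also be a HEAD request, which asks the server not to send the body at all. The Python sketch below just builds the raw request bytes so the CRLF line endings are easy to see. Note that this swaps in HEAD for the GET used above, and the function name `build_head_request` is hypothetical.

```python
from urllib.parse import urlsplit

def build_head_request(url):
    """Build the raw bytes of an HTTP/1.1 HEAD request for url."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    lines = [
        "HEAD %s HTTP/1.1" % path,
        "Host: %s" % parts.netloc,
        "Connection: close",  # ask the server to close after responding
    ]
    # Each line ends with CRLF; an empty line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

print(build_head_request("http://www.gznow.org/sitemap_google.xml.gz"))
```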

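Which way of getting the remote file size is fastest?
Testing against the same URL, from fastest to slowest: curl > fsock > getheader
Testing against different URLs, the result is the same: curl > fsock > getheader
This test may well be inaccurate, but the getheader function is clearly slower.

Since the curl module is not as widely available as fsock, I still use the latter myself.

The difference in speed is roughly that curl is 0.2 seconds faster than fsock, and fsock is 0.2 seconds faster than getheader.

What is the remote file size good for?
Apparently some people use it to download files in chunks.
I use it to tell whether a remote file has been updated (even though that is not accurate). Does anyone have a better way? Leave me a comment.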
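One possible answer, offered only as a sketch: the header array shown earlier also carries ETag and Last-Modified, which usually change when the file changes even if its size does not. The Python function below compares two sets of headers from successive checks and falls back to Content-Length only when neither of those is present. The name `has_changed` and the dictionaries are assumptions for illustration.

```python
def has_changed(old_headers, new_headers):
    """Compare headers from two successive checks of the same URL."""
    for name in ("ETag", "Last-Modified", "Content-Length"):
        old, new = old_headers.get(name), new_headers.get(name)
        if old is not None and new is not None:
            # Decide on the first header that both responses provide.
            return old != new
    return True  # nothing comparable: assume the file may have changed

before = {"ETag": '"3f80f-1b6-3e1cb03b"', "Content-Length": "438"}
after = {"ETag": '"3f80f-1b6-3e1cb03b"', "Content-Length": "438"}
print(has_changed(before, after))  # False
```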