Automatically redirect or remove double slashes (//) in URLs with .htaccess

Newsun

Googlebot, Google's search engine spider (crawler), is one of the most advanced content grabbers: it extracts most, if not all, of the data on the Internet and World Wide Web, regardless of whether that content or those web pages are intended for public visitors.

As a result, some webmasters may notice Google crawling and indexing web pages that should not exist in the first place. One such case is a malformed URL in which a double, triple or longer run of forward slashes appears because extra slashes have been added to the page URL. For example, a page at https://www.domain.com/index.php may also be crawled by Google as https://www.domain.com//index.php, or sometimes even https://www.domain.com///index.php.

Google's crawling and indexing can be traced in the Apache (or other web server) access logs. URLs with double, triple, quadruple or more slashes can show up in Google search results and can potentially cause duplicate content issues and, worse, a penalty that makes the site vanish from the Google results listing or pushes it into the supplemental results with a low ranking.
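As a rough way to check the access log for such requests, the sketch below pulls out request paths that contain adjacent slashes. It assumes the log uses the Common or Combined Log Format; the function name `multi_slash_paths` and the sample entries are hypothetical and only for illustration.

```python
import re

# Matches the request line inside a Common/Combined Log Format entry,
# e.g. "GET //index.php HTTP/1.1" (the log format is an assumption).
REQUEST_RE = re.compile(r'"(?:[A-Z]+) (\S+) HTTP/[\d.]+"')

def multi_slash_paths(log_lines):
    """Return request paths that contain two or more adjacent slashes."""
    hits = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue
        # Look only at the path, not at the query string.
        path = match.group(1).split("?", 1)[0]
        if "//" in path:
            hits.append(path)
    return hits

# Hypothetical sample entries, not taken from a real server.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2008:13:55:36 +0000] "GET //index.php HTTP/1.1" 200 2326 "-" "Googlebot/2.1"',
    '203.0.113.5 - - [10/Oct/2008:13:55:40 +0000] "GET /index.php HTTP/1.1" 200 2326 "-" "Googlebot/2.1"',
    '203.0.113.9 - - [10/Oct/2008:13:56:02 +0000] "GET /blog///post.php?id=3 HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for path in multi_slash_paths(SAMPLE_LOG):
        print(path)
```

To use it on a real log, pass an open file object instead of `SAMPLE_LOG`; the function only needs an iterable of lines.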

The reason for the additional slashes in the URL is unknown, and it seems to happen only with the Google search engine. One possibility is that visitors or other websites type or post an incorrect backlink (external link), which is then picked up by the ultra-sensitive Googlebot or by Mediapartners-Google (the Google AdSense crawler, which also contributes to Google Search indexing). One thing is certain, though: once one page has been indexed with a double slash or several adjacent slashes, every other page on the website may be prone to the same error.

Fixing the double or multiple slash URL issue is fairly simple. Create a mod_rewrite directive, either in .htaccess or in the Apache httpd.conf configuration file, that permanently redirects (status 301) every URL containing two or more adjacent or trailing slashes to its proper, valid form with a single slash (/) in place of //, ///, //// or more. To do this, create a .htaccess file in the root directory of the website (normally public_html) with the following URL rewriting and redirection directive; if .htaccess already exists, just add the code at the top.

# Enable mod_rewrite (skip this line if the file already contains it)
RewriteEngine On

# Remove multiple slashes anywhere in URL
RewriteCond %{REQUEST_URI} ^(.*)//(.*)$
RewriteRule . %1/%2 [R=301,L]
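To see what the condition above does to a given path, the regular expression can be tried outside Apache. The following is only a sketch in Python that reuses the same pattern; the helper names `redirect_target` and `follow_redirects` are made up for this illustration, and the loop stands in for a browser or crawler following each 301 in turn.

```python
import re

# Same pattern as the RewriteCond above: greedy prefix, "//", remainder.
DOUBLE_SLASH = re.compile(r"^(.*)//(.*)$")

def redirect_target(path):
    """Return the path one redirect would point to, or None if none fires."""
    match = DOUBLE_SLASH.match(path)
    if match is None:
        return None
    # Mirrors the substitution %1/%2 in the RewriteRule.
    return f"{match.group(1)}/{match.group(2)}"

def follow_redirects(path, limit=20):
    """Follow simulated redirects until no adjacent slashes remain."""
    hops = [path]
    while len(hops) <= limit:
        target = redirect_target(hops[-1])
        if target is None:
            break
        hops.append(target)
    return hops

if __name__ == "__main__":
    print(follow_redirects("/a///b.php"))
```

For example, `follow_redirects("/a///b.php")` passes through `/a//b.php` and ends at `/a/b.php`: each hop removes one extra slash, so a longer run of slashes costs more than one redirect.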


The rule above examines the whole part of the URL after the domain name. Each redirect turns one double slash into a single slash, and the redirect repeats until no adjacent slashes remain, so URLs with more than two slashes are handled too. Administrators who want a more efficient rewrite command can use the following code instead. It works only in a .htaccess file, where the leading slash is removed from the path before the pattern is matched, and it only handles extra slashes directly after the domain name.

# Remove multiple slashes after domain
RewriteRule ^/(.*)$ https://www.domain.com/$1 [R=301,L]


Replace www.domain.com with your own domain name.
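The reach of this shorter rule can be checked with a small model. The sketch below is a simplification in Python, not Apache itself: it assumes a root-level .htaccess, where the first slash of the path is removed before the pattern is matched, and it assumes the server has not already merged the slashes. The function name `htaccess_redirect` is hypothetical.

```python
import re

# Pattern from the shorter rule above.
LEADING_SLASH = re.compile(r"^/(.*)$")

def htaccess_redirect(url_path, site="https://www.domain.com"):
    """Return the redirect URL for url_path, or None if the rule does not fire.

    Simplified model (an assumption): in a root-level .htaccess the first
    slash of the path is removed before the pattern is tested.
    """
    relative = url_path[1:] if url_path.startswith("/") else url_path
    match = LEADING_SLASH.match(relative)
    if match is None:
        return None
    # Mirrors the substitution https://www.domain.com/$1.
    return f"{site}/{match.group(1)}"

if __name__ == "__main__":
    for path in ("/index.php", "//index.php", "/blog//post.php"):
        print(path, "->", htaccess_redirect(path))
```

Under this model `//index.php` redirects to `https://www.domain.com/index.php`, while `/blog//post.php` is left alone because its extra slash is not directly after the domain name; that case still needs the first rule.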

 