Get HTML content from another site

Active 3 hr ago
Viewed 126 times

6 Answers


Because of cross-domain security restrictions (the browser's same-origin policy), you won't be able to do this client-side unless you're content with an iframe.

With PHP, you can use several methods of "scraping" the content. The approach you use depends on whether you need to use cookies in your requests (i.e. the data is behind a login).

Either way, to start things off on the client side you'll issue a standard AJAX request to your own server:

$.ajax({
   type: "POST",
   url: "localProxy.php",
   data: {
      url: "maybe_send_your_url_here.php?product_id=1"
   }
}).done(function(html) {
   // do something with your HTML!
});
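For readers not using jQuery, the same request could be sketched with the standard fetch API. This is only a minimal sketch, assuming localProxy.php reads a form-encoded `url` field (the encoding jQuery uses by default for a `data` object). The helper name `buildProxyRequest` is hypothetical and not part of the original answer.

```javascript
// Hypothetical helper: builds the options for a POST to the local proxy.
// Assumes localProxy.php expects a form-encoded "url" field.
function buildProxyRequest(targetUrl) {
  return {
    endpoint: "localProxy.php",
    options: {
      method: "POST",
      headers: { "Content-Type": "application/x-www-form-urlencoded" },
      // URLSearchParams percent-encodes "?" and "=" inside the value.
      body: new URLSearchParams({ url: targetUrl }).toString(),
    },
  };
}

// Browser-side usage (the page must be served from the same origin as localProxy.php):
// const { endpoint, options } = buildProxyRequest("maybe_send_your_url_here.php?product_id=1");
// fetch(endpoint, options)
//   .then((response) => response.text())
//   .then((html) => { /* do something with your HTML! */ });
```

Keeping the request construction separate from the network call makes the encoding easy to inspect before anything is sent.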

A very simple demo displays, below, the content of another HTML page, anotherpage.html.

XMLHttpRequest has no responseHTML attribute alongside responseText and responseXML, but this is not a problem. If we want to get data from another HTML page and insert it into the displayed page, this can be achieved easily. The responseXML attribute holds an XML document that the DOM can access; to provide the equivalent for HTML we need just a <div> tag and a JavaScript function that extends Ajax or the framework we are using.

To put the data into the page and make it visible to readers, the content is stored into a <div> tag with the "displayed" id (or the id of your choice). To store the other document inside the page, a storage tag has to be included anywhere in the page.

Alternatively, the storage may be a variable of type Element:

var responseHTML = document.createElement("body");
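As a rough illustration of the idea, the sketch below pulls the body markup out of a response's text and places it into the "displayed" element. It is not the original article's code: the names `extractBodyHtml` and `loadHtmlInto` are hypothetical, the regular expression is a simplification rather than a real HTML parser, and the request only works for a page on the same origin.

```javascript
// Hypothetical helper: returns the markup between <body> and </body>,
// or the whole text when no body tag is present.
// A regex is a simplification here, not a full HTML parser.
function extractBodyHtml(responseText) {
  const match = /<body[^>]*>([\s\S]*?)<\/body>/i.exec(responseText);
  return match ? match[1] : responseText;
}

// Browser-only sketch: load a same-origin page and show its body
// inside the element with the given id.
function loadHtmlInto(url, targetId) {
  const xhr = new XMLHttpRequest();
  xhr.onload = function () {
    if (xhr.status === 200) {
      document.getElementById(targetId).innerHTML = extractBodyHtml(xhr.responseText);
    }
  };
  xhr.open("GET", url);
  xhr.send();
}

// In a page containing <div id="displayed"></div>:
// loadHtmlInto("anotherpage.html", "displayed");
```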

Tip: Links can of course be styled with CSS, to get another look!

A local link (a link to a page within the same website) is specified with a relative URL (without the "https://www" part):

Link to a page located in the html folder on the current web site:

Link to a page located in the same folder as the current page:

HTML Links - Syntax

The HTML <a> tag defines a hyperlink. It has the following syntax:

<a href="url">link text</a>

.contents()
Description: Get the children of each element in the set of matched elements, including text and comment nodes.
Version added: 1.2. This method does not accept any arguments.

The .contents() method can also be used to get the content document of an iframe, if the iframe is on the same domain as the main page.

We can employ the .contents() method to help convert this blob of text into three well-formed paragraphs:

<div class="container"> Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. <br><br> Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. <br><br> Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.</div>
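In the browser, the jQuery route would filter the .contents() result down to text nodes (nodeType 3), wrap each one in a <p>, and remove the <br> elements. The sketch below is a DOM-free stand-in that works on the markup as a string instead, so it is not the .contents() technique itself; the function name `blobToParagraphs` is hypothetical.

```javascript
// Hypothetical, string-based stand-in for the .contents() example:
// splits the inner markup on runs of <br> tags and wraps each piece in <p>.
function blobToParagraphs(innerHtml) {
  return innerHtml
    .split(/(?:<br\s*\/?>\s*)+/i)   // one or more consecutive <br> tags
    .map((piece) => piece.trim())
    .filter((piece) => piece.length > 0)
    .map((piece) => "<p>" + piece + "</p>")
    .join("");
}

// Example with shortened text in place of the Lorem ipsum blob:
// blobToParagraphs(" First. <br><br> Second. <br><br> Third.")
```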

I use cURL, then parse the returned code to extract the data you're looking for into an array; from there your script can do with it as it wishes.

It is advised to at all times configure PHP to not allow external file reads except through the cURL library. If the cURL library isn't needed (which is often the case), it should be turned off as well.

The fact that the very first example for the function in the manual shows how to get the contents of an external URL would seem to disprove your point.

Anytime you are playing with 3rd party information you have to take precautions, but your statement above is a massive exaggeration. Want an ultra-secure server? Don't connect it to the internet.

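// Fetch a remote page with cURL and keep the response as a string.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, '');            // target URL (left blank in the original)
curl_setopt($ch, CURLOPT_HEADER, 1);          // include the response headers in the output
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);  // return the response instead of printing it
$data = curl_exec($ch);                       // the handle must be passed to curl_exec()
curl_close($ch);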

Inline frames, like <frame> elements, are included in the window.frames pseudo-array. From the inside of a frame, a script can get a reference to its parent window with window.parent.

With the referrerpolicy attribute set to origin, the sent referrer will be limited to the origin of the referring page: its scheme, host, and port.

<iframe src="" title="iframe Example 1" width="400" height="300"></iframe>
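To make "scheme, host, and port" concrete, the sketch below uses the standard URL API to show which part of a page address counts as its origin. The helper name `referrerOrigin` and the example.com addresses are assumptions for illustration only.

```javascript
// Hypothetical helper: returns the origin (scheme, host, and port) of a page URL.
// Default ports are omitted by the URL API. The Referer header actually sent
// under an "origin" policy is this value followed by a trailing slash.
function referrerOrigin(pageUrl) {
  return new URL(pageUrl).origin;
}

// referrerOrigin("https://example.com:8443/shop/item?id=1#top")
// referrerOrigin("https://example.com/page.html")
```

The path, query string, and fragment are all dropped, which is what keeps the embedded site from learning which page the frame was loaded on.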
