Hello,
I'm using a WebClient to download and parse a web page to get all the images that are in <img.../> tags.
When the image paths are relative (e.g. 'images/hello.png') and the URL of the web page uses URL rewriting, like "http://example.org/454/history-friday", I can't work out the real URL of the image.
For example, if I have this URL: "http://example.org/454/history"
and I get this relative image path: "images/hello.png"
how can I get the real path to download the image?
If the real URL of the page is "http://example.org/index.php?cat_id=454&type=history", the image URL should be "http://example.org/images/hello.png",
or if the page is "http://example.org/454/index.php?type=history", the image URL should be "http://example.org/454/images/hello.png"...
I tried to use the BaseAddress property of WebClient, but it is always empty...
Thx for your help.
wat? real url is what your top line of text in browser shows o_O never met relative paths to something, I always get full paths
(08-24-2013, 11:58)Arteq Wrote: [ -> ]wat? real url is what your top line of text in browser shows o_O never met relative paths to something, I always get full paths
This answer is pretty retarded. Did you even read the thread?
@narkos, I have no idea. Sorry.
I don't want to get the URL of the page...
I want to get the URL to download the images...
With this information:
page url: "http://example.org/454/history/september"
image path: "images/hello.png"
How can I get the URL to download the image?
Thx
(08-24-2013, 12:11)narkos Wrote: [ -> ]I don't want to get the url of the page....
I want to get the url to download the images...
With this informations:
page url: "http://example.org/454/history/september"
image path: "images/hello.png"
How can i get the url to download the image?
Thx
Download an image with your browser, then look at its address. Try to work out the addresses of the other images from it.
(08-24-2013, 12:07)Pozzuh Wrote: [ -> ] (08-24-2013, 11:58)Arteq Wrote: [ -> ]wat? real url is what your top line of text in browser shows o_O never met relative paths to something, I always get full paths
This answer is pretty retarded. Did you even read the thread?
@narkos, I have no idea. Sorry.
Typical @Arteq retardness. We DO NOT NEED STUPID RETARD MODERATORS LIKE THIS ONE.
There is an SVN downloader on CodePlex. It shows you links to all the files. It's freeware and the source code is released; google "Svn downloader codeplex". It's also written in C# and VB.
@Bandarigoda123 I don't understand your solution
@d0h! thx, but my script already finds the src of the images in the pages it scans
I need to automatically find the URL of the script that is executed when I browse a link like "http://example.org/454/history/september". For example, the executed script may be stored at http://example.org/454/showpage.php. If I find that, and I find an image with the relative path "images/hello.png", I can build the image link like this: http://example.org/454/images/hello.png
Does anyone understand what I mean?
Thank you!
(08-24-2013, 17:23)narkos Wrote: [ -> ]@Bandarigoda123 i don't understand your solution
@d0h! thx but my script already found the src of the images in the pages it scans
I need to find automatically the url of the page that is executed when i browse a link like "http://example.org/454/history/september", for example the executed script is maybe stored at http://example.org/454/showpage.php, if i found that, and i found an image with the relative path "images/hello.png", i can create the image link like this: http://example.org/454/images/hello.png
Someone understand what i mean?
Thank you!
http://downloadsvn.codeplex.com/
Check this out. It finds all the paths when you type in an address.
But if you want to access the .php file itself, I think it isn't possible.
This can be solved by splitting the page URL on '/', but I don't know if that's the best possible way.
CSHARP Code
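public static string GetAbsoluteUrlFromRelative(string relativeUrl, string webpage)
{
    // "http://example.org/454/history" splits into
    // ["http:", "", "example.org", "454", "history"]
    string[] parts = webpage.Split('/');

    if (relativeUrl.StartsWith("/"))
    {
        // Root-relative path: keep only the scheme and host.
        return parts[0] + "//" + parts[2] + relativeUrl;
    }

    // Document-relative path: drop the last segment of the page URL
    // and append the relative path to what remains.
    // Joining by index (rather than comparing each segment to the last one)
    // avoids dropping earlier segments that happen to equal the last segment.
    string baseUrl = string.Join("/", parts, 0, parts.Length - 1) + "/";
    return baseUrl + relativeUrl;
}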
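For comparison with the splitting approach above, here is a minimal sketch that lets System.Uri do the resolution instead. As far as I know, a browser never sees the script that the server runs behind a rewritten URL; it resolves a relative src against the URL it requested, or against a <base href="..."> tag if the page has one. So the rewritten page URL should be enough. The class name RelativeUrlSketch, the method name ResolveImageUrl and the baseHref parameter are made up for this example; extracting the <base> tag from the HTML is assumed to happen elsewhere in your parser.

```csharp
using System;

public static class RelativeUrlSketch
{
    // Hypothetical helper: resolves an <img> src against the page it was found on.
    // pageUrl  - the URL that was actually requested (rewritten or not).
    // src      - the value of the src attribute (relative or absolute).
    // baseHref - the value of a <base href="..."> tag if the page has one, else null.
    public static string ResolveImageUrl(string pageUrl, string src, string baseHref = null)
    {
        var baseUri = new Uri(pageUrl, UriKind.Absolute);

        // A <base href> overrides the page URL as the base for relative links.
        if (!string.IsNullOrEmpty(baseHref))
        {
            baseUri = new Uri(baseUri, baseHref);
        }

        // Uri(Uri, string) applies the standard relative-reference rules:
        // it replaces the last path segment of the base, and it also handles
        // "/rooted" paths, "../" segments and src values that are already absolute.
        return new Uri(baseUri, src).AbsoluteUri;
    }
}
```

With the URLs from this thread, ResolveImageUrl("http://example.org/454/history", "images/hello.png") gives "http://example.org/454/images/hello.png", and ResolveImageUrl("http://example.org/454/history/september", "images/hello.png") gives "http://example.org/454/history/images/hello.png". If the site really serves its images from the root, the page will normally use a rooted src ("/images/hello.png") or a <base> tag, and passing that tag's value as baseHref covers the second case.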