
SiteMap slow if SEF enabled

Started by feline, 2011 July 28, 12:16:31 pm


feline

2011 July 28, 12:16:31 pm Last Edit: 2011 September 26, 11:53:39 am by feline
If you have the SiteMap modification installed and the PortaMx SEF engine is enabled, creating the SiteMap XML list is very slow. We have made a fix for this, which we are posting here...

1. Download the attached file sitemap_patch.zip, extract it and copy the files to your server (note the folder structure).
2. Change the Sitemap setting (Admin - Configuration - Modification Settings - Sitemap) and set "Time that XML data should be cached (seconds)" to 7500.

Additionally, you can create a scheduled task that creates a new Sitemap every 2 hours and stores it in the cache.
If you want to do that, you have to enable the cache in SMF and make the following modifications:
Open the file Sources/ScheduledTasks.php in an editor.
Find:
// Finally, send some stuff...
header('Expires: Mon, 26 Jul 1997 05:00:00 GMT');
header('Last-Modified: ' . gmdate('D, d M Y H:i:s') . ' GMT');
header('Content-Type: image/gif');
die("\x47\x49\x46\x38\x39\x61\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00\x21\xF9\x04\x01\x00\x00\x00\x00\x2C\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x02\x44\x01\x00\x3B");
}


Add the following after it and save the file:
// create SiteMap XML data
function scheduled_SiteMapCacheXML()
{
    global $context, $scripturl, $settings, $txt, $user_info, $modSettings, $smcFunc, $sourcedir, $mbname;

    include_once($sourcedir .'/Sitemap.php');

    // Load integration info...
    if (($context['sitemap']['extensions'] = cache_get_data('sitemap_extensions', 3600)) == null)
    {
        $ext_dir = $sourcedir . '/Sitemap-Ext';
        $context['sitemap']['extensions'] = array();
        if (is_readable($ext_dir))
        {
            $dh = opendir($ext_dir);
            // Compare against false so a file named "0" does not end the loop early
            while (($filename = readdir($dh)) !== false)
            {
                // Skip these
                if (in_array($filename, array('.', '..')) || preg_match('~^sitemap_([a-zA-Z_-]+)\.php~', $filename, $match) == 0)
                    continue;

                if (@include_once($ext_dir . '/' . $filename))
                    $context['sitemap']['extensions'][$match[1]] = array($filename, 'has_display' => function_exists(ucwords($match[1]) . 'Display'), 'has_xml' => function_exists(ucwords($match[1]) . 'XML'));
            }
            closedir($dh);
        }

        cache_put_data('sitemap_extensions', $context['sitemap']['extensions'], 3600);
    }

    require_once($sourcedir . '/News.php');

    // Setup the main forum url...
    $context['sitemap']['main'] = array('time' => date_iso8601());

    // Fixup the query_see_board so it only displays what guests would see
    $old_see_board = $user_info['query_see_board'];
    $user_info['query_see_board'] = '(FIND_IN_SET(-1, b.member_groups))';

    $context['sitemap']['items'] = array();

    // Extensions?
    foreach ($context['sitemap']['extensions'] as $ext => $info)
    {
        if (!$info['has_xml'])
            continue;

        include_once($sourcedir . '/Sitemap-Ext/' . $info[0]);
        $xmlFunction = ucwords($ext) . 'XML';
        $context['sitemap']['items'] = array_merge($context['sitemap']['items'], $xmlFunction(true));
    }

    // Restore the original board visibility query
    $user_info['query_see_board'] = $old_see_board;

    // All went well!
    return true;
}


Open the file /Themes/default/languages/Modifications.english.php in an editor.
Find the Sitemap strings, add the following, then save:
$txt['scheduled_task_desc_SiteMapCacheXML'] = 'Creates a new XML Sitemap and stores it in the SMF cache.';
$txt['scheduled_task_SiteMapCacheXML'] = 'Sitemap create XML';


Now you must insert the task into the database.
For this you have to use MySQLAdmin or another database tool.
Insert the following row into the SMF scheduled tasks table (scheduled_tasks, with your table prefix):
id_task            leave it empty
next_time          0
time_offset        0
time_regularity    2
time_unit          h
disabled           0
task               SiteMapCacheXML
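
The row above can also be added with one SQL statement. The following is only a sketch that builds the statement text in PHP (7 or newer). It assumes the default table prefix smf_, and the helper name build_sitemap_task_insert is made up for illustration; check the prefix of your own installation before running the result in your database tool.

```php
<?php
// Hypothetical helper: builds the INSERT statement for the new scheduled task.
// Assumption: the SMF tables use the default prefix "smf_".
function build_sitemap_task_insert(string $prefix = 'smf_'): string
{
    // id_task is left out on purpose so the database assigns it.
    $row = array(
        'next_time'       => 0,
        'time_offset'     => 0,
        'time_regularity' => 2,
        'time_unit'       => 'h',
        'disabled'        => 0,
        'task'            => 'SiteMapCacheXML',
    );

    $columns = implode(', ', array_keys($row));

    // Integers are written as-is, strings are wrapped in single quotes.
    $values = implode(', ', array_map(
        function ($value) {
            return is_int($value) ? (string) $value : "'" . $value . "'";
        },
        $row
    ));

    return 'INSERT INTO ' . $prefix . 'scheduled_tasks (' . $columns . ') VALUES (' . $values . ');';
}

echo build_sitemap_task_insert(), "\n";
```

The values are fixed constants here, so no escaping is done; a version that accepts user input would need proper escaping or prepared statements.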

Finally, go to Admin - Maintenance - Scheduled Tasks.
Find the new entry and check Run Now on it. Then click the Run Now button.
If everything works, from now on a new sitemap XML is created and cached every 2 hours.


[attachment deleted by admin]
Many are stubborn in relation to the path, a few in relation to the target.

hartiberlin

#1
2011 July 28, 10:54:29 pm
Thanks for the info.
For which Sitemap Mod version does this work ?

Could you also please release your own Sitemap Mod maybe in the future ?

SlimmedDime still has not updated his version sitemap_2-2-1.zip for SMF 2.0 Final, so
it probably still has bugs ... ?!

I don't trust his mods anymore...
Why does he call himself a mod author when he does
not update his mods once the final versions are out ???

Many thanks.

Regards, Stefan.

feline

#2
2011 July 28, 11:07:23 pm
Quote from: hartiberlin ,  2011 July 28, 10:54:29 pm
Thanks for the info.
For which Sitemap Mod version does this work ?

it's for Sitemap 2.2.1 (zip filename) ...

Quote from: hartiberlin
Could you also please release your own Sitemap Mod maybe in the future ?

No, no .. we can't make a fork for all things.
SiteMap 2.2.1 works with 2.0 .. simply emulate "SMF 2.0 RC4" when installing ...

hartiberlin

#3
2011 July 29, 01:13:09 am
Does one need this patch also in the upcoming PortaMX1.4 version ?

Many thanks.

feline

#4
2011 July 29, 09:50:47 am
Anyone who uses Sitemap and PortaMx with SEF enabled CAN apply the patch.
The patch uses an internal SEF function to create the links in the XML list faster.
The patch works from PortaMx 1.2 onward ...

MiY4Gi

#5
2011 August 01, 10:35:52 pm
I made Time that XML data should be cached (seconds) => 86400 (i.e. 24 hours).

Will that cause problems with the patch?

Check out my new anime club. Discuss and share anime, or just fool around with other forumites. Join us at MyAnimeClub.

feline

#6
2011 August 01, 11:25:07 pm
No .. but if you create a scheduled task, you have to set its interval a little bit shorter than the caching time ...
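
The rule in this reply can be checked with a few lines of PHP (7 or newer). This is only a sketch: the function names are made up, and the units m, d and w are an assumption alongside the h used in the first post. It also suggests why a 2-hour task is paired with a 7500-second cache there: 2 hours is 7200 seconds, slightly less than 7500.

```php
<?php
// Converts a scheduled-task interval (time_regularity + time_unit) to seconds.
// Assumption: units are m (minutes), h (hours), d (days), w (weeks).
function task_interval_seconds(int $regularity, string $unit): int
{
    $factors = array('m' => 60, 'h' => 3600, 'd' => 86400, 'w' => 604800);
    if (!isset($factors[$unit]))
        throw new InvalidArgumentException('Unknown time unit: ' . $unit);

    return $regularity * $factors[$unit];
}

// True if the task runs again before the cached XML expires.
function task_refreshes_before_expiry(int $regularity, string $unit, int $cacheSeconds): bool
{
    return task_interval_seconds($regularity, $unit) < $cacheSeconds;
}

var_dump(task_refreshes_before_expiry(2, 'h', 7500));  // bool(true):  7200 < 7500
var_dump(task_refreshes_before_expiry(2, 'h', 86400)); // bool(true):  7200 < 86400
var_dump(task_refreshes_before_expiry(1, 'd', 86400)); // bool(false): 86400 is not < 86400
```

With a cache time of 86400 seconds, the 2-hour task from the first post still refreshes the cache in time; a task that runs only once a day would not.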

MiY4Gi

#7
2011 September 14, 09:00:30 pm
I've got a slight problem. All my forum's URLs end with a forward slash "/", but the sitemap displays the URLs without this forward slash. I know that the default SMF install doesn't have this forward slash, so it's probably a problem caused by my portal. Where does the sitemap get the URLs? Both URLs work, with or without the slash, but I only want the sitemap to show the one with the slash. What code do I need to change to do this?

feline

#8
2011 September 15, 12:33:11 pm
I have looked at your site and the sitemap URLs look correct (with a / at the end).

MiY4Gi

#9
2011 September 25, 09:38:38 pm
Nope, the sitemap URLs don't have a slash. I need that slash, since Google indexes both URLs, the one with the slash and the one without.

I know Google eventually removes the one without the slash, since its canonical URL points to the page with the slash, but there is still a period where both URLs appear in the Google index.

I have 2 choices. Either I remove the slash from the main URLs by editing the PortaMx code, or I add a slash to the URLs in the sitemap by editing the Sitemap code. Since I rather like the slashes, and it's less work, I'd prefer to only change the Sitemap code.
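
The thread does not show where the Sitemap mod builds its URLs, so the second choice can only be sketched in general terms. The helper below is hypothetical (ensure_trailing_slash is not part of the Sitemap mod or PortaMx) and assumes PHP 7 or newer; it would have to be applied wherever the sitemap URLs are produced. It leaves URLs alone that already end in a slash or whose last path segment looks like a file name.

```php
<?php
// Hypothetical helper, not part of the Sitemap mod or PortaMx.
// Appends "/" to the path of a URL unless the path already ends with one
// or its last segment looks like a file name (contains a dot).
function ensure_trailing_slash(string $url): string
{
    // Split off the query string and fragment so only the path is touched.
    $pos = strcspn($url, '?#');
    $base = substr($url, 0, $pos);
    $rest = (string) substr($url, $pos);

    $path = parse_url($base, PHP_URL_PATH);
    if ($path === false)
        return $url; // Malformed URL: leave it unchanged.

    $path = (string) $path;
    if (substr($path, -1) === '/')
        return $url; // Already has the slash.

    if ($path !== '' && strpos(basename($path), '.') !== false)
        return $url; // Looks like a file, e.g. index.php.

    return $base . '/' . $rest;
}

echo ensure_trailing_slash('http://example.com/forum/general-board'), "\n";
echo ensure_trailing_slash('http://example.com/index.php?topic=1.0'), "\n";
```

The example URLs are placeholders. Whether a dot in the last segment is a reliable sign of a file depends on how the SEF engine names topics, so that check is an assumption to verify against real URLs.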

[attachment deleted by admin]

feline

#10
2011 September 26, 11:39:59 am
If you use the PortaMx SEF and add the patch for Sitemap that I posted above, everything works well.

MiY4Gi

#11
2011 September 26, 05:46:31 pm
Quote from: feline ,  2011 September 26, 11:39:59 am
If you use the PortaMx SEF and add the patch for Sitemap that I posted above, everything works well.


Did you change the patch in your opening post?

feline

#12
2011 September 26, 07:32:56 pm
Yes .. I fixed it a little bit

hartiberlin

#13
2011 November 11, 05:43:07 am


I just saw that SlimmedDime has not updated his mod after 11 months ! >:(

I think such programmers should be kicked out as mod programmers
if they don't update their mods within 1 month of the final versions !

What a lame programmer ! :o

Must this fix above from Feline still be applied in the 1.45 PortaMX version ?

Many thanks.

feline

#14
2011 November 11, 03:31:06 pm
The fix has nothing to do with the PortaMx version ...

hartiberlin

#15
2011 November 11, 03:57:39 pm
Okay, so I do have to apply the fix, yes ?

I currently have it running without this fix...
what happens then ?

Thanks.

feline

#16
2011 November 11, 04:01:16 pm
You don't have to ... together with the scheduled task, the XML list can be built faster

hartiberlin

#17
2011 November 12, 05:15:36 am
Okay, where is the sitemap file actually located now
when SEF is enabled ?

At:
domain.com/sitemap/
or
domain.com/sitemap.xml

??

So what do I have to enter in Google Webmaster Tools ?

Thanks.

feline

#18
2011 November 12, 10:28:44 am
Quote from: hartiberlin ,  2011 November 12, 05:15:36 am
Okay, where is the sitemap file actually located now
when SEF is enabled ?
So what do I have to enter in Google Webmaster Tools ?

The SEF URL is domain.tld/sitemap/xml/

hartiberlin

#19
2011 November 15, 12:46:48 am
Hmm,
Webmaster Tools finds nothing:

http://www.overunity.com/sitemap/xml/

What could be the cause ?

I had only installed the patch
sitemap_patch2.zip
and set the value to 7500,
but I have not set up the cron job yet...

Do I also have to make the changes for the cron job ?

Thanks.

hartiberlin

#20
2011 November 15, 12:52:54 am
P.S.: Could it be that, because of SEF, Google Webmaster Tools can no longer access
robots.txt ?

feline

#21
2011 November 15, 01:37:57 pm
When I access the URL http://www.overunity.com I only get

Connection Problems
Sorry, SMF was unable to connect to the database. This may be caused by the server being busy. Please try again later.

You can simply test the link yourself ... there you will see the XML list. And if that works, it will work with Google too.
And this has nothing to do with robots.txt; I have the following one, for example:
User-agent: *
Disallow: /attachments/
Disallow: /avatars/
Disallow: /cache/
Disallow: /editor_uploads/
Disallow: /fckeditor/
Disallow: /Packages/
Disallow: /Smileys/
Disallow: /Sources/
Disallow: /Themes/

hartiberlin

#22
2011 November 15, 07:58:44 pm
So,
now it works again...
I don't know what the cause was either.

At:

www.overunity.com/sitemap/xml
nothing came up at all..
I then uninstalled the Sitemap plugin once more and
reinstalled it, and then applied the patch from the start of this thread again,
but still nothing came up...

Then I changed the first value from 20,000 threads to 10,000 threads,
saved it, and lo and behold, something came up !

Then I set it back to 20,000 and now it works...
There must have been some wrong parameters in the database before....

So now I will also make this cron job modification, so that it rebuilds the sitemap
every 2 hours, and then register it in Google Webmaster Tools.

Thanks.

Regards, Stefan.

hartiberlin

#23
2011 November 16, 01:42:11 am
Hmm,
just now nothing was displayed again at:
www.overunity.com/sitemap/xml


Then I lowered the number of threads to 1000,
saved, and then raised it back
to 20,000...

But I had not set up the cron job yet, I had only
uploaded patch2...


In Google Webmaster Tools, however, it now only shows:

Submitted URLs
1,087
0 URLs in web index


Hmm, but I surely have more than 1087 threads...

Something still can't be right there...

feline

#24
2011 November 16, 10:48:38 am
You should start with a smaller number first and see how long building the XML page takes.
If that takes longer than the maximum execution time of a PHP script, the server aborts .... and then nothing comes up
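
A rough way to follow this advice is to time the build and compare it with PHP's max_execution_time setting. The sketch below (PHP 7 or newer) uses made-up function names and a placeholder instead of the real sitemap build. It measures wall-clock time, which is only an approximation, because on some platforms the limit does not count time spent waiting on the database or other system calls.

```php
<?php
// Runs $build and returns how many seconds it took (wall-clock time).
function time_build(callable $build): float
{
    $start = microtime(true);
    $build();
    return microtime(true) - $start;
}

// True if $elapsed stays under $limit seconds, keeping a safety margin.
// A limit of 0 means "no limit", as with PHP's max_execution_time.
function fits_time_limit(float $elapsed, int $limit, float $margin = 0.8): bool
{
    return $limit === 0 || $elapsed < $limit * $margin;
}

$limit = (int) ini_get('max_execution_time');
$elapsed = time_build(function () {
    // Placeholder for the real work, e.g. building the sitemap XML page.
    usleep(10000);
});

printf(
    "build took %.3f s, limit %d s, ok: %s\n",
    $elapsed,
    $limit,
    fits_time_limit($elapsed, $limit) ? 'yes' : 'no'
);
```

Starting with a small number of threads and raising it while the result stays "ok" follows the approach suggested in this reply; the 0.8 margin is an arbitrary choice.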