2020 September 05

SoHard 🎄 in Scrapy
If he'd been working with Scrapy instead of a hand-rolled scraper, he'd have seen the redirects

SoHard 🎄 in Scrapy
Or at the very least, if he'd logged anything

Семён Трояновский... in Scrapy
Well, that sounds logical and is most likely what happened, but who knows

Семён Трояновский... in Scrapy
I'm just going by the principle that between malice (the site deliberately and cleverly inserting duplicate links) and incompetence (the guy couldn't iterate correctly), I pick the latter

SoHard 🎄 in Scrapy
Семён Трояновский
I'm just going by the principle that between malice (the site deliberately and cleverly inserting duplicate links) and incompetence (the guy couldn't iterate correctly), I pick the latter
There could also be an ad on every page

SoHard 🎄 in Scrapy
which he visits every time

Семён Трояновский... in Scrapy
SoHard 🎄
There could also be an ad on every page
Sure, plenty of things are possible in theory, but I liked the redirect theory)

Yaswanth Bangaru in Scrapy
Yaswanth Bangaru
Thanks for that. On a side note, I read that running a Selenium scraper on a server is pretty similar to running it on my local PC with the headless flag. Should I expect and be prepared for any surprises?
I don't know if this doubt is related, but my Selenium scraper is running fine on the server. I was wondering if I could connect to the same server from a new terminal and run a second scraper. How do I find out if multiple connections to the server are possible?

SoHard 🎄 in Scrapy
Yaswanth Bangaru
I don't know if this doubt is related, but my Selenium scraper is running fine on the server. I was wondering if I could connect to the same server from a new terminal and run a second scraper. How do I find out if multiple connections to the server are possible?
tmux
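A quick sketch of what that looks like: tmux keeps sessions alive on the server even after you disconnect, so each scraper can run in its own named session (the session and script names below are made up):

```shell
# Start the first scraper in a detached tmux session (names are hypothetical):
tmux new-session -d -s scraper1 'python scraper1.py'

# From a second terminal connected to the same server, start another one:
tmux new-session -d -s scraper2 'python scraper2.py'

# List running sessions, or attach to one to watch its output:
tmux ls
tmux attach -t scraper1
# Detach again with Ctrl-b d; the scraper keeps running.
```

Both scrapers keep running after you close your SSH connections, and you can reattach to either session later.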

Yaswanth Bangaru in Scrapy
Exactly what I'm looking for. Thanks

Кирилл in Scrapy
Yaswanth Bangaru
I don't know if this doubt is related, but my Selenium scraper is running fine on the server. I was wondering if I could connect to the same server from a new terminal and run a second scraper. How do I find out if multiple connections to the server are possible?
Of course it's possible. But your scraper should manage the Selenium instances itself and distribute URLs between them, etc. Otherwise multiple instances of your scraper will do the same job, and there's no point in that

Yaswanth Bangaru in Scrapy
Кирилл
Of course it's possible. But your scraper should manage the Selenium instances itself and distribute URLs between them, etc. Otherwise multiple instances of your scraper will do the same job, and there's no point in that
Okay, so do you mean I could just as well fetch multiple URLs at the same time using a single scraper instead?

Кирилл in Scrapy
You could also use supervisor; it'll run as many instances of your scraper as you want in the background as daemons
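For reference, a minimal supervisord.conf sketch of that setup (the program name and script path are hypothetical): `numprocs` sets how many copies to run, and `autorestart` brings them back up if they crash.

```ini
; Hypothetical supervisord program section: run two copies of the
; scraper as background daemons and restart them on failure.
[program:scraper]
command=python /home/user/scraper.py
process_name=%(program_name)s_%(process_num)02d
numprocs=2
directory=/home/user
autostart=true
autorestart=true
stdout_logfile=/var/log/scraper_%(process_num)02d.log
```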

Кирилл in Scrapy
Yaswanth Bangaru
Okay, so do you mean I could just as well fetch multiple URLs at the same time using a single scraper instead?
Yes, a single scraper process that spawns Selenium browsers in threads or sub-processes and syncs their tasks and results
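A minimal sketch of that pattern with Python's `concurrent.futures`: `fetch_page` is a stand-in that in a real scraper would drive a headless Selenium browser (`driver.get(url)` and so on); the URLs and worker count are made up.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch_page(url):
    # Placeholder for the real Selenium call, e.g.:
    #   driver.get(url); return driver.page_source
    return f"<html>content of {url}</html>"

def scrape_all(urls, max_workers=4):
    """Distribute URLs across worker threads and collect the results."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit one task per URL and remember which future maps to which URL.
        futures = {pool.submit(fetch_page, u): u for u in urls}
        # Collect results as each worker finishes, in completion order.
        for fut in as_completed(futures):
            results[futures[fut]] = fut.result()
    return results

pages = scrape_all(["https://example.com/1", "https://example.com/2"])
print(len(pages))  # → 2
```

The point Кирилл makes holds here: the distribution logic (which URL goes to which worker, how results are merged) lives in this one process, instead of several independent scrapers all fetching the same URLs.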

Yaswanth Bangaru in Scrapy
Кирилл
Yes, a single scraper process that spawns Selenium browsers in threads or sub-processes and syncs their tasks and results
I could use supervisor for this?

Кирилл in Scrapy
Supervisor is just a process runner; it doesn't know about your needs

Кирилл in Scrapy
Кирилл
Yes, a single scraper process that spawns Selenium browsers in threads or sub-processes and syncs their tasks and results
You have to program this business logic

Yaswanth Bangaru in Scrapy
I see, I'll check it out. I don't want to mess with my scraper right now

Yaswanth Bangaru in Scrapy
What does "connection reset by [vps IP address] port 22" mean?

Yaswanth Bangaru in Scrapy
Is it related to internet connectivity on my end in any way? I'm running a web scraper and it stops between loops