/settings/ads/ keeps popping up when scraping Google

I have a program that scrapes Google. It is an open-source vulnerability scraper that uses mechanize to search Google. It picks a random search query from a text file to decide what to search for.

I'll post the main file and a link to the Git repository, because of the size of the program.

Anyway, the program is used to scrape for sites, but every now and then while scraping it comes across a 'URL' (I use the term loosely) that looks like this:

[17:05:02 INFO]I'll run in default mode! 
[17:05:02 INFO]I'm searching for possible SQL vulnerable sites, using search query inurl:/main.php?f1= 

[17:05:04 SUCCESS]Site found: http://forix.autosport.com/main.php?l=0&c=1 
[17:05:05 SUCCESS]Site found: https://zweeler.com/formula1/FantasyFormula12016/main.php?ref=103 
[17:05:06 SUCCESS]Site found: https://en.zweeler.com/formula1/FantasyFormula1YearGame2015/main.php 
[17:05:07 SUCCESS]Site found: http://modelcargo.com/main.php?mod=sambachoose&dep=samba 
[17:05:08 SUCCESS]Site found: http://www.ukdirt.co.uk/main.php?P=rules&f=8 
[17:05:09 SUCCESS]Site found: http://www.ukdirt.co.uk/main.php?P=tracks&g=2&d=2&m=0 
[17:05:11 SUCCESS]Site found: http://zoohoo.sk/redir.php?q=v%FDsledok&url=http%3A%2F%2Flivescore.sk%2Fmain.php%3Flang%3Dsk 
[17:05:12 SUCCESS]Site found: http://www.chemical-plus.com/main.php?f1=pearl_pigment.htm 
[17:05:13 SUCCESS]Site found: http://www.fantasyf1.co/main.php 
[17:05:14 SUCCESS]Site found: http://www.escritores.cl/base.php?f1=escritores/main.php 
[17:05:15 SUCCESS]Site found: /settings/ads/preferences?hl=en #<= Right here

When this shows up, it crashes the program completely. I have tried the following:

next if urls == '/settings/ads/preferences?hl=en' 
next if urls =~ /preferences?hl=en/ 
next if urls.split('/')[2] == 'ads/preferences?hl=en'

However, it keeps showing up. I should also mention that the last five characters depend on your location; so far I have seen:

hl=en 
hl=ru 
hl=ia

Does anyone have any idea what this is? I have done some research and literally cannot find anything on it. Any help with this would be fantastic.

Main source:

#!/usr/local/env ruby 

require 'rubygems' 
require 'bundler/setup' 
require 'mechanize' 
require 'nokogiri' 
require 'rest-client' 
require 'timeout' 
require 'uri' 
require 'fileutils' 
require 'colored' 
require 'yaml' 
require 'date' 
require 'optparse' 
require 'tempfile' 
require 'socket' 
require 'net/http' 
require_relative 'lib/modules/format.rb' 
require_relative 'lib/modules/credits.rb' 
require_relative 'lib/modules/legal.rb' 
require_relative 'lib/modules/spider.rb' 
require_relative 'lib/modules/copy.rb' 
require_relative 'lib/modules/site_info.rb' 

include Format 
include Credits 
include Legal 
include Whitewidow 
include Copy 
include SiteInfo 

PATH = Dir.pwd 
VERSION = Whitewidow.version 
SEARCH = File.readlines("#{PATH}/lib/search_query.txt").sample 
info = YAML.load_file("#{PATH}/lib/rand-agents.yaml") 
@user_agent = info['user_agents'][info.keys.sample] 
OPTIONS = {} 

def usage_page 
    Format.usage("You can run me with the following flags: #{File.basename(__FILE__)} -[d|e|h] -[f] <path/to/file/if/any>") 
    exit 
end 

def examples_page 
    Format.usage('This is my examples page, I\'ll show you a few examples of how to get me to do what you want.') 
    Format.usage('Running me with a file: whitewidow.rb -f <path/to/file> keep the file inside of one of my directories.') 
    Format.usage('Running me default, if you don\'t want to use a file, because you don\'t think I can handle it, or for whatever reason, you can run me default by passing the Default flag: whitewidow.rb -d this will allow me to scrape Google for some SQL vuln sites, no guarentees though!') 
    Format.usage('Running me with my Help flag will show you all options an explanation of what they do and how to use them') 
    Format.usage('Running me without a flag will show you the usage page. Not descriptive at all but gets the point across') 
end 

OptionParser.new do |opt| 
    opt.on('-f FILE', '--file FILE', 'Pass a file name to me, remember to drop the first slash. /tmp/txt.txt <= INCORRECT tmp/text.txt <= CORRECT') { |o| OPTIONS[:file] = o } 
    opt.on('-d', '--default', 'Run me in default mode, this will allow me to scrape Google using my built in search queries.') { |o| OPTIONS[:default] = o } 
    opt.on('-e', '--example', 'Shows my example page, gives you some pointers on how this works.') { |o| OPTIONS[:example] = o } 
end.parse! 

def page(site) 
    Nokogiri::HTML(RestClient.get(site)) 
end 

def parse(site, tag, i) 
    parsing = page(site) 
    parsing.css(tag)[i].to_s 
end 

def format_file 
    Format.info('Writing to temporary file..') 
    if File.exists?(OPTIONS[:file]) 
    file = Tempfile.new('file') 
    IO.read(OPTIONS[:file]).each_line do |s| 
     File.open(file, 'a+') { |format| format.puts(s) unless s.chomp.empty? } 
    end 
    IO.read(file).each_line do |file| 
     File.open("#{PATH}/tmp/#sites.txt", 'a+') { |line| line.puts(file) } 
    end 
    file.unlink 
    Format.info("File: #{OPTIONS[:file]}, has been formatted and saved as #sites.txt in the tmp directory.") 
    else 
    puts <<-_END_ 

      Hey now my friend, I know you're eager, I am also, but that file #{OPTIONS[:file]} 
      either doesn't exist, or it's not in the directory you say it's in.. 

      I'm gonna need you to go find that file, move it to the correct directory and then 
      run me again. 

      Don't worry I'll wait! 
    _END_ 
    .yellow.bold 
    end 
end 

def get_urls 
    Format.info("I'll run in default mode!") 
    Format.info("I'm searching for possible SQL vulnerable sites, using search query #{SEARCH}") 
    agent = Mechanize.new 
    agent.user_agent = @user_agent 
    page = agent.get('http://www.google.com/') 
    google_form = page.form('f') 
    google_form.q = "#{SEARCH}" 
    url = agent.submit(google_form, google_form.buttons.first) 
    url.links.each do |link| 
    if link.href.to_s =~ /url.q/ 
     str = link.href.to_s 
     str_list = str.split(%r{=|&}) 
     urls = str_list[1] 
     next if urls.split('/')[2].start_with? 'stackoverflow.com', 'github.com', 'www.sa-k.net', 'yoursearch.me', 'search1.speedbit.com', 'duckfm.net', 'search.clearch.org', 'webcache.googleusercontent.com' 
     next if urls == '/settings/ads/preferences?hl=en' #<= ADD HERE REMEMBER A COMMA => 
     urls_to_log = URI.decode(urls) 
     Format.success("Site found: #{urls_to_log}") 
     sleep(1) 
     sql_syntax = ["'", "`", "--", ";"].each do |sql| 
     File.open("#{PATH}/tmp/SQL_sites_to_check.txt", 'a+') { |s| s.puts("#{urls_to_log}#{sql}") } 
     end 
    end 
    end 
    Format.info("I've dumped possible vulnerable sites into #{PATH}/tmp/SQL_sites_to_check.txt") 
end 

def vulnerability_check 
    case 
    when OPTIONS[:default] 
    file_to_read = "tmp/SQL_sites_to_check.txt" 
    when OPTIONS[:file] 
    Format.info("Let's check out this file real quick like..") 
    file_to_read = "tmp/#sites.txt" 
    end 
    Format.info('Forcing encoding to UTF-8') unless OPTIONS[:file] 
    IO.read("#{PATH}/#{file_to_read}").each_line do |vuln| 
    begin 
     Format.info("Parsing page for SQL syntax error: #{vuln.chomp}") 
     Timeout::timeout(10) do 
     vulns = vuln.encode(Encoding.find('UTF-8'), {invalid: :replace, undef: :replace, replace: ''}) 
     begin 
      if parse("#{vulns.chomp}'", 'html', 0)[/You have an error in your SQL syntax/] 
      Format.site_found(vulns.chomp) 
      File.open("#{PATH}/tmp/SQL_VULN.txt", "a+") { |s| s.puts(vulns) } 
      sleep(1) 
      else 
      Format.warning("URL: #{vulns.chomp} is not vulnerable, dumped to non_exploitable.txt") 
      File.open("#{PATH}/log/non_exploitable.txt", "a+") { |s| s.puts(vulns) } 
      sleep(1) 
      end 
     rescue Timeout::Error, OpenSSL::SSL::SSLError 
      Format.warning("URL: #{vulns.chomp} failed to load dumped to non_exploitable.txt") 
      File.open("#{PATH}/log/non_exploitable.txt", "a+") { |s| s.puts(vulns) } 
      next 
      sleep(1) 
     end 
     end 
    rescue RestClient::ResourceNotFound, RestClient::InternalServerError, RestClient::RequestTimeout, RestClient::Gone, RestClient::SSLCertificateNotVerified, RestClient::Forbidden, OpenSSL::SSL::SSLError, Errno::ECONNREFUSED, URI::InvalidURIError, Errno::ECONNRESET, Timeout::Error, OpenSSL::SSL::SSLError, Zlib::GzipFile::Error, RestClient::MultipleChoices, RestClient::Unauthorized, SocketError, RestClient::BadRequest, RestClient::ServerBrokeConnection, RestClient::MaxRedirectsReached => e 
     Format.err("URL: #{vuln.chomp} failed due to an error while connecting, URL dumped to non_exploitable.txt") 
     File.open("#{PATH}/log/non_exploitable.txt", "a+") { |s| s.puts(vuln) } 
     next 
    end 
    end 
end 

case 
    when OPTIONS[:default] 
    begin 
     Whitewidow.spider 
     sleep(1) 
     Credits.credits 
     sleep(1) 
     Legal.legal 
     get_urls 
     vulnerability_check unless File.size("#{PATH}/tmp/SQL_sites_to_check.txt") == 0 
     Format.warn("No sites found for search query: #{SEARCH}. Logging into error_log.LOG. Create a issue regarding this.") if File.size("#{PATH}/tmp/SQL_sites_to_check.txt") == 0 
     File.open("#{PATH}/log/error_log.LOG", 'a+') { |s| s.puts("No sites found with search query #{SEARCH}") } if File.size("#{PATH}/tmp/SQL_sites_to_check.txt") == 0 
     File.truncate("#{PATH}/tmp/SQL_sites_to_check.txt", 0) 
     Format.info("I'm truncating SQL_sites_to_check file back to #{File.size("#{PATH}/tmp/SQL_sites_to_check.txt")}") 
     Copy.file("#{PATH}/tmp/SQL_VULN.txt", "#{PATH}/log/SQL_VULN.LOG") 
     File.truncate("#{PATH}/tmp/SQL_VULN.txt", 0) 
     Format.info("I've run all my tests and queries, and logged all important information into #{PATH}/log/SQL_VULN.LOG") 
    rescue Mechanize::ResponseCodeError, RestClient::ServiceUnavailable, OpenSSL::SSL::SSLError, RestClient::BadGateway => e 
     d = DateTime.now 
     Format.fatal("Well this is pretty crappy.. I seem to have encountered a #{e} error. I'm gonna take the safe road and quit scanning before I break something. You can either try again, or manually delete the URL that caused the error.") 
     File.open("#{PATH}/log/error_log.LOG", 'a+'){ |error| error.puts("[#{d.month}-#{d.day}-#{d.year} :: #{Time.now.strftime("%T")}]#{e}") } 
     Format.info("I'll log the error inside of #{PATH}/log/error_log.LOG for further analysis.") 
    end 
    when OPTIONS[:file] 
    begin 
     Whitewidow.spider 
     sleep(1) 
     Credits.credits 
     sleep(1) 
     Legal.legal 
     Format.info('Formatting file') 
     format_file 
     vulnerability_check 
     File.truncate("#{PATH}/tmp/SQL_sites_to_check.txt", 0) 
     Format.info("I'm truncating SQL_sites_to_check file back to #{File.size("#{PATH}/tmp/SQL_sites_to_check.txt")}") 
     Copy.file("#{PATH}/tmp/SQL_VULN.txt", "#{PATH}/log/SQL_VULN.LOG") 
     File.truncate("#{PATH}/tmp/SQL_VULN.txt", 0) 
     Format.info("I've run all my tests and queries, and logged all important information into #{PATH}/log/SQL_VULN.LOG") unless File.size("#{PATH}/log/SQL_VULN.LOG") == 0 
    rescue Mechanize::ResponseCodeError, RestClient::ServiceUnavailable, OpenSSL::SSL::SSLError, RestClient::BadGateway => e 
     d = DateTime.now 
     Format.fatal("Well this is pretty crappy.. I seem to have encountered a #{e} error. I'm gonna take the safe road and quit scanning before I break something. You can either try again, or manually delete the URL that caused the error.") 
     File.open("#{PATH}/log/error_log.LOG", 'a+'){ |error| error.puts("[#{d.month}-#{d.day}-#{d.year} :: #{Time.now.strftime("%T")}]#{e}") } 
     Format.info("I'll log the error inside of #{PATH}/log/error_log.LOG for further analysis.") 
    end 
    when OPTIONS[:example] 
    examples_page 
    else 
    Format.warning('You failed to pass me a flag!') 
    usage_page 
end

Is there anything in this code that would cause this to pop up randomly? It only happens with random search queries.

Link to GitHub

UPDATE:

I've discovered that the link to Google's advertising services has the same path in its URL as the one giving me problems, but that doesn't explain why I'm getting this link or why I can't skip over it.

2016-04-17 13aal

Nothing from anyone...? – 13aal

How can someone reproduce your error and help you fix it **legally**? Is the 'default' mode legal...? – rdupz

There's nothing illegal about it. It only finds sites; it doesn't exploit them. – 13aal

urls = "settings/ads/preferences?hl=ru" 

if urls =~ /settings\/ads\/preferences\?hl=[a-z]{2}/ 
    p "I'm skipped" 
end 

=> "I'm skipped"
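
For context, here is a small sketch of why the three checks in the question never fire while the pattern above does. One thing follows from the question's own code: `get_urls` takes `str.split(%r{=|&})[1]`, so the value in `urls` can never contain a literal `=`. The log shows `hl=en` only after `URI.decode`, so the `=` was presumably percent-encoded (`%3D`) in the raw href. The exact href below is a hypothetical example of that shape, not something captured from Google, and `extract_target` is just a name for the question's split step.

```ruby
# Hypothetical raw href; only the encoded "=" is implied by the question's
# code, the rest of the shape is an assumption.
SAMPLE_HREF = '/url?q=/settings/ads/preferences%3Fhl%3Den&sa=U'

# The value as it appears in the log, i.e. after URI.decode.
LOGGED = '/settings/ads/preferences?hl=en'

# Attempt 2 from the question: the unescaped "?" makes the preceding "s"
# optional, so this looks for "preference(s)" directly followed by "hl=en".
ATTEMPT_REGEX = /preferences?hl=en/

# The pattern from this answer: "?" is escaped, any two-letter code is allowed.
ANSWER_REGEX = /settings\/ads\/preferences\?hl=[a-z]{2}/

# Same extraction as get_urls: the text between the first "=" and the next
# "=" or "&".
def extract_target(href)
  href.split(/=|&/)[1]
end

p extract_target(SAMPLE_HREF) == LOGGED  # false: the raw value is still encoded
p(LOGGED =~ ATTEMPT_REGEX)               # nil: "?" is a quantifier here
p LOGGED.split('/')[2]                   # "ads": index 0 is "" because of the leading slash
p(LOGGED =~ ANSWER_REGEX)                # 1: matches, starting right after the leading slash
```

So the equality test compares an encoded string with a decoded literal, the second test has a regex bug, and the third looks at the wrong path segment.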

2016-04-18 13:40:45 rdupz

Is a regex the only way to do it? – 13aal

That doesn't work. – 13aal
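
One possible reason the pattern appears not to work inside `get_urls`: the `next if` checks there run before `URI.decode`, so they see the still-encoded value, in which `\?hl=` cannot match. A regex is also not the only option. The sketch below decodes first and then skips anything that is not an absolute http(s) URL, which covers every `hl=xx` variant. `internal_google_link?` is a hypothetical helper, not part of whitewidow, and the sample values are assumptions about the shape of the extracted strings. It uses `URI.decode_www_form_component` because `URI.decode` is no longer available in Ruby 3.0 and later; note that it also turns `+` into a space.

```ruby
require 'uri'

# Hypothetical helper: true when the extracted value should be skipped
# because, once decoded, it is not an absolute http(s) URL.
def internal_google_link?(raw)
  decoded = URI.decode_www_form_component(raw.to_s)
  !decoded.start_with?('http://', 'https://')
end

# Hypothetical sample values, shaped like what get_urls extracts.
candidates = [
  'http://www.fantasyf1.co/main.php',
  '/settings/ads/preferences%3Fhl%3Dru'
]

candidates.each do |raw|
  # Decide on the decoded value before anything is logged or written to file.
  next if internal_google_link?(raw)
  puts "Site found: #{URI.decode_www_form_component(raw)}"
end
```

In the question's loop this would take the place of the `next if urls == ...` line, with `urls` passed in as `raw`.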
