Python code in get_noaa_smoke does not pass PEP 8 check #2

@brianhigh

When I checked the Python code in the get_noaa_smoke demo against PEP 8 using pep8online.com, I found several style violations.

A modified version that passes the PEP 8 checks follows:

import re
import os.path
import scrapy
from scrapy.http import Request
from scrapy.crawler import CrawlerProcess


class get_hms_shapefiles(scrapy.Spider):
    """Get daily SMOKE files from May through Sept. for the years 2008-2017."""
    name = "get_hms_shapefiles"
    domain = "satepsanone.nesdis.noaa.gov"
    allowed_domains = [domain]
    # range() excludes its stop value, so 2018 is needed to include 2017.
    start_urls = ["http://%s/pub/volcano/FIRE/HMS_ARCHIVE/%s/GIS/SMOKE/" %
                  (domain, year) for year in range(2008, 2018)]

    def parse(self, response):
        for href in response.xpath('//a/@href').extract():
            regexp = r'hms_smoke[0-9]{4}0[5-9]{1}[0-9]{2}\.(dbf|shp|shx)\.gz$'
            if re.match(regexp, href):
                yield Request(url=response.urljoin(href),
                              callback=self.save_file)

    def save_file(self, response):
        path = response.url.split('/')[-1]
        if not os.path.exists(path):
            with open(path, 'wb') as f:
                f.write(response.body)


process = CrawlerProcess()
process.crawl(get_hms_shapefiles)
process.start()  # blocks until the crawl has finished
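
The filename filter in parse() can be checked without running a crawl. The following is a minimal sketch, not part of the demo: the helper name is_wanted and the sample hrefs are made up for illustration, and the pattern drops the redundant {1} quantifier but should otherwise behave the same as the one above.

```python
import re

# Equivalent to the pattern in parse(); the redundant {1} is dropped.
SMOKE_FILE = re.compile(r'hms_smoke[0-9]{4}0[5-9][0-9]{2}\.(dbf|shp|shx)\.gz$')


def is_wanted(href):
    """Return True if href names a May-September smoke shapefile part."""
    return SMOKE_FILE.match(href) is not None


# Hypothetical hrefs, invented for illustration only.
samples = [
    "hms_smoke20080501.shp.gz",       # May: kept
    "hms_smoke20080930.dbf.gz",       # September: kept
    "hms_smoke20081001.shp.gz",       # October: skipped
    "hms_smoke20080501.prj.gz",       # other extension: skipped
    "/pub/hms_smoke20080501.shp.gz",  # leading path: skipped by re.match
]
kept = [s for s in samples if is_wanted(s)]
print(kept)
```

Because re.match only matches at the start of the string, the filter assumes the directory listing links to bare filenames. If the hrefs turned out to carry a leading path, re.search or a match against the last path component would be needed instead.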
