=begin
==========================================================================
Bookmark Parser
Parses a bookmark.html file generated by a browser (currently tested with
Firefox, Chrome and Safari) and returns an array of hashes, one per
bookmark, each holding the bookmark's url and title.
Copyright (C) 2012 Jorge Rodríguez
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
==========================================================================
=end
require 'uri'
class BookmarkParser
=begin
self.parseFile - Processes an HTML file with the bookmarks.
--
Input: filepath (e.g. '/home/user/documents/mybookmarks.html')
Output:
- array of hashes, one per bookmark, each with the keys :url and :title
  (both values are URI-escaped)
- nil if filepath does not end with ".html"
=end
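  def self.parseFile(filepath)
    # Only parses HTML files
    return nil unless filepath.end_with?(".html")

    # URI.escape was removed in Ruby 3.0; the RFC 2396 parser offers the same escaping
    parser = URI::RFC2396_Parser.new
    result = []
    # File.foreach closes the file once every line has been read
    File.foreach(filepath) do |line|
      # Capture the href ($1) and the link text ($2) of each bookmark entry
      if line =~ /<dt>\s*<a href="(http[^"]+)"[^>]*>([^<]+)</i
        result << { url: parser.escape($1), title: parser.escape($2) }
      end
    end
    result
  end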
end