Cata1022/Parallel-news-aggregator

PARALLEL ALGORITHMS

News Aggregator

1. Implementation

The goal is to implement a parallel program in Java that simulates the operation of a news aggregator. The program processes a large volume of articles, organizes them by category and language, and generates statistics and aggregated reports. The final result is a set of text files that present the information extracted from the input articles in a structured and deterministic manner.

2. Context, objectives, and motivation

The volume of information available online is very large and is continuously generated by social networks, press websites, and various digital platforms. While access to this data is easier than ever, its abundance also brings difficulties: the content is often repetitive, hard to organize, and difficult to follow in a coherent form. News aggregators offer a solution to this challenge. They collect articles from various sources and present them in a structured format, adapted to users' need to quickly browse relevant topics. They can also highlight topics of major interest, such as public health or politics.

The project aims to reproduce, on a smaller scale, the functionality of such an aggregator. The goal is to illustrate how a large volume of articles can be processed in parallel, how they can be grouped according to categories, and how clear and consistent reports can be generated, in a manner similar to real applications that support organized access to information.

My personal goals for this project were:

  • To practice my programming skills using Java Threads;
  • To practice decomposing a problem described in natural language into subproblems that can be executed in parallel;
  • To practice the decision-making process of identifying a scalable parallel solution to the proposed problem.

3. Technical details

The project is implemented in Java. Its main goal is the parallel processing of a set of news articles stored in .json files.

Article structure

Each file contains several articles, represented in JSON format, with multiple fields. For this project, I use the values of the keys uuid, title, author, url, text, published, language, and categories.
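As an illustration, the keys listed above could be modeled as a small immutable Java type. This is only a sketch: the names `ArticleExample` and `Article` are hypothetical, and the Java types are assumptions (in particular, that published is kept as a plain string and that categories is a list of strings). JSON parsing is left out because it depends on the library chosen.

```java
import java.util.List;

public class ArticleExample {
    // Hypothetical value type: component names mirror the JSON keys;
    // the Java types are assumptions, not taken from the project.
    record Article(String uuid, String title, String author, String url,
                   String text, String published, String language,
                   List<String> categories) {
        Article {
            // Defensive, immutable copy so worker threads can share articles safely.
            categories = (categories == null) ? List.of() : List.copyOf(categories);
        }
    }

    public static void main(String[] args) {
        // Made-up sample values, for illustration only.
        Article a = new Article("id-1", "Sample title", "Jane Doe",
                "https://example.com/a", "Body text", "2024-01-01T10:00:00Z",
                "english", List.of("health", "politics"));
        System.out.println(a.title() + " [" + a.language() + "] " + a.categories());
    }
}
```

Records require Java 16 or newer; on older versions a final class with private final fields would play the same role.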

Parallel processing

A central objective of the assignment is to use parallel programming to handle a large volume of articles.
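One possible way to split such work across Java threads is sketched below. It is not the project's actual implementation: the class and method names are hypothetical, each article is reduced to its list of categories, and the strategy (static index ranges, one private map per thread, a merge after `join`) is an assumption. The merged result is stored in a `TreeMap` so that the output order does not depend on thread scheduling.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ParallelCountExample {
    // Counts articles per category using numThreads worker threads.
    // Each inner list holds the categories of one article (a simplification).
    static Map<String, Integer> countByCategory(List<List<String>> articles, int numThreads) {
        List<Map<String, Integer>> partials = new ArrayList<>();
        Thread[] workers = new Thread[numThreads];
        int n = articles.size();

        for (int t = 0; t < numThreads; t++) {
            // Static partitioning: thread t handles indices [start, end).
            int start = (int) ((long) t * n / numThreads);
            int end = (int) ((long) (t + 1) * n / numThreads);
            // Each thread writes only to its own map, so no locking is needed.
            Map<String, Integer> local = new HashMap<>();
            partials.add(local);
            workers[t] = new Thread(() -> {
                for (int i = start; i < end; i++) {
                    for (String category : articles.get(i)) {
                        local.merge(category, 1, Integer::sum);
                    }
                }
            });
            workers[t].start();
        }

        try {
            // join() makes each worker's writes visible to this thread.
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for workers", e);
        }

        // Merge partial results; TreeMap keeps keys sorted for deterministic output.
        Map<String, Integer> total = new TreeMap<>();
        for (Map<String, Integer> partial : partials) {
            partial.forEach((k, v) -> total.merge(k, v, Integer::sum));
        }
        return total;
    }

    public static void main(String[] args) {
        List<List<String>> articles = List.of(
                List.of("health", "politics"),
                List.of("health"),
                List.of("sports"),
                List.of("politics", "health"));
        System.out.println(countByCategory(articles, 2));
        // prints {health=3, politics=2, sports=1}
    }
}
```

Keeping a private map per thread avoids contention on a shared structure; a `ConcurrentHashMap` shared by all threads would be an alternative with a different trade-off.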

About

A parallel news aggregator built in Java that uses multithreading to efficiently process large volumes of JSON articles and generate structured statistical reports
