2 changes: 1 addition & 1 deletion sources/academy/platform/getting_started/apify_client.md
@@ -189,7 +189,7 @@ from apify_client import ApifyClient

client = ApifyClient(token='YOUR_TOKEN')

-actor = client.actor('YOUR_USERNAME/adding-actor').call(run_input={
+run = client.actor('YOUR_USERNAME/adding-actor').call(run_input={
'num1': 4,
'num2': 2
})
@@ -31,7 +31,7 @@ In the page that opens, you can see your newly created Actor. In the **Settings*

First, we'll start with the `requirements.txt` file. Its purpose is to list all the third-party packages that your Actor will use. We will be using the `pandas` package for parsing the downloaded weather data, and the `matplotlib` package for visualizing it. We don't care about versions of these packages, so we list just their names:

```py
```text
# Add your dependencies here.
# See https://pip.pypa.io/en/latest/cli/pip_install/#requirements-file-format
# for how to format them
@@ -44,6 +44,8 @@ The Actor's main logic will live in the `main.py` file. Let's delete everything

Next, we'll import all the packages we will use in the code:

<!-- group doccmd[all]: start -->

```py
from io import BytesIO
import os
@@ -127,6 +129,8 @@ print(f'Result is available at {os.environ["APIFY_API_PUBLIC_BASE_URL"]}'
+ f'/v2/key-value-stores/{os.environ["APIFY_DEFAULT_KEY_VALUE_STORE_ID"]}/records/prediction.png')
```

<!-- group doccmd[all]: end -->

And that's it! Save the changes in the editor, then click **Build and run** at the bottom of the page. The Actor will be built, the built image will be saved for future reuse, and then the Actor will run. You can follow the progress of the build and the run in the **Last build** and **Last run** tabs, respectively, in the developer console in the Actor source view. Once the Actor finishes running, its log will contain the URL where you can access the plot we created.

![Building and running the BBC Weather Parser Actor](./images/bbc-weather-parser-source.png)
12 changes: 10 additions & 2 deletions sources/academy/tutorials/python/scrape_data_python.md
@@ -63,7 +63,7 @@ In the page that opens, you can see your newly created Actor. In the **Settings*

First, we'll start with the `requirements.txt` file. Its purpose is to list all the third-party packages that your Actor will use. We will be using the `requests` package for downloading the BBC Weather pages, and the `beautifulsoup4` package for parsing and processing the downloaded pages. We don't care about versions of these packages, so we list just their names:

```py
```text
# Add your dependencies here.
# See https://pip.pypa.io/en/latest/cli/pip_install/#requirements-file-format
# for how to format them
@@ -78,6 +78,8 @@ Finally, we can get to writing the main logic for the Actor, which will live in

First, we need to import all the packages we will use in the code:

<!-- group doccmd[all]: start -->

```py
from datetime import datetime, time, timedelta, timezone
import os
@@ -205,6 +207,8 @@ default_dataset_client.push_items(weather_data)
print(f'Results have been saved to the dataset with ID {os.environ["APIFY_DEFAULT_DATASET_ID"]}')
```

<!-- group doccmd[all]: end -->

### Running the Actor

And that's it! Save the changes in the editor, then click **Build and run** at the bottom of the page. The Actor will be built, the built image will be saved for future reuse, and then the Actor will run. You can follow the progress of the build and the run in the **Last build** and **Last run** tabs, respectively, in the developer console in the Actor source view. Once the Actor finishes running, you can view the scraped data in the **Dataset** tab in the Actor run view.
@@ -231,7 +235,7 @@ In the page that opens, you can see your newly created Actor. In the **Settings*

First, we'll start with the `requirements.txt` file. Its purpose is to list all the third-party packages that your Actor will use. We will be using the `pandas` package for parsing the downloaded weather data, and the `matplotlib` package for visualizing it. We don't care about versions of these packages, so we list just their names:

```py
```text
# Add your dependencies here.
# See https://pip.pypa.io/en/latest/cli/pip_install/#requirements-file-format
# for how to format them
@@ -244,6 +248,8 @@ The Actor's main logic will live in the `main.py` file. Let's delete everything

Next, we'll import all the packages we will use in the code:

<!-- group doccmd[all]: start -->

```py
from io import BytesIO
import os
@@ -327,6 +333,8 @@ print(f'Result is available at {os.environ["APIFY_API_PUBLIC_BASE_URL"]}'
+ f'/v2/key-value-stores/{os.environ["APIFY_DEFAULT_KEY_VALUE_STORE_ID"]}/records/prediction.png')
```

<!-- group doccmd[all]: end -->

And that's it! Save the changes in the editor, then click **Build and run** at the bottom of the page. The Actor will be built, the built image will be saved for future reuse, and then the Actor will run. You can follow the progress of the build and the run in the **Last build** and **Last run** tabs, respectively, in the developer console in the Actor source view. Once the Actor finishes running, its log will contain the URL where you can access the plot we created.

![Building and running the BBC Weather Parser Actor](./images/bbc-weather-parser-source.png)
@@ -148,7 +148,7 @@ if (priceText.startsWith("From ")) {

Great! Except we've overlooked an important pitfall called [floating-point error](https://en.wikipedia.org/wiki/Floating-point_error_mitigation). In short, computers store floating-point numbers in binary, which can't represent some decimal fractions exactly:

```py
```pycon
> 0.1 + 0.2
0.30000000000000004
```
@@ -34,7 +34,7 @@ Now let's test that all works. Inside the project directory we'll create a new f
```py
import httpx

-print("OK")
+print("OK", httpx.__version__)
```

Running it as a Python program will verify that our setup is okay and we've installed HTTPX:
@@ -46,6 +46,8 @@ Now let's use it for parsing the HTML. The `BeautifulSoup` object allows us to w

We'll update our code to the following:

<!-- group doccmd[all]: start -->

```py
import httpx
from bs4 import BeautifulSoup
@@ -74,6 +76,8 @@ first_heading = headings[0]
print(first_heading.text)
```

<!-- group doccmd[all]: end -->

If we run our scraper again, it prints the text of the first `h1` element:

```text
@@ -158,12 +158,14 @@ When translated to a tree of Python objects, the element with class `price` will

We can use Beautiful Soup's `.contents` property to access individual nodes. It returns a list of nodes like this:

```py
```text
["\n", <span class="visually-hidden">Sale price</span>, "$74.95"]
```

It seems like we can read the last element to get the actual amount. Let's fix our program:

<!-- group doccmd[all]: start -->

```py
import httpx
from bs4 import BeautifulSoup
@@ -198,6 +200,8 @@ The results seem to be correct, but they're hard to verify because the prices vi
print(title, price, sep=" | ")
```

<!-- group doccmd[all]: end -->

The output is much nicer this way:

```text
@@ -34,6 +34,12 @@ It's because some products have variants with different prices. Later in the cou

Ideally we'd go and discuss the problem with those who are about to use the resulting data. For their purposes, is the fact that some prices are just minimum prices important? What would be the most useful representation of the range for them? Maybe they'd tell us that it's okay if we just remove the `From` prefix?

<!-- group doccmd[all]: start -->

<!-- invisible-code-block: py
product = object
-->

```py
price_text = product.select_one(".price").contents[-1]
price = price_text.removeprefix("From ")
@@ -51,6 +57,8 @@ else:
price = min_price
```

<!-- group doccmd[all]: end -->

:::tip Built-in string methods

If you're not proficient in Python's string methods, [.startswith()](https://docs.python.org/3/library/stdtypes.html#str.startswith) checks the beginning of a given string, and [.removeprefix()](https://docs.python.org/3/library/stdtypes.html#str.removeprefix) removes something from the beginning of a given string.
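
As a quick illustration of the two methods, here is a minimal sketch. The sample strings and the `strip_from_prefix` helper are made up for this example, and `.removeprefix()` assumes Python 3.9 or newer:

```python
def strip_from_prefix(price_text: str) -> tuple[str, bool]:
    """Return the text without a leading 'From ' and whether the prefix was present.

    Hypothetical helper for illustration only, not part of the lesson's program.
    """
    has_prefix = price_text.startswith("From ")
    # .removeprefix() returns the string unchanged when the prefix is absent
    return price_text.removeprefix("From "), has_prefix


print(strip_from_prefix("From $1,998.00"))
print(strip_from_prefix("$74.95"))
```

Because `.removeprefix()` leaves strings without the prefix untouched, calling it unconditionally is safe; the `.startswith()` check is only needed when the program must remember that the price was a minimum.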
@@ -59,6 +67,8 @@ If you're not proficient in Python's string methods, [.startswith()](https://doc

The whole program would look like this:

<!-- group doccmd[all]: start -->

```py
import httpx
from bs4 import BeautifulSoup
@@ -112,7 +122,7 @@ These might be useful in some complex scenarios, but in our case, they won't mak

We got rid of the `From` and possible whitespace, but we still can't save the price as a number in our Python program:

```py
```pycon
>>> price = "$1,998.00"
>>> float(price)
Traceback (most recent call last):
@@ -154,7 +164,7 @@ else:

Great! Except we've overlooked an important pitfall called [floating-point error](https://en.wikipedia.org/wiki/Floating-point_error_mitigation). In short, computers store floating-point numbers in binary, which can't represent some decimal fractions exactly:

```py
```pycon
>>> 0.1 + 0.2
0.30000000000000004
```
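
One common way to sidestep this pitfall with money is to keep amounts as whole cents. The following is only a sketch under stated assumptions: `price_to_cents` is a hypothetical helper, and it assumes every price text has a `$` prefix and exactly two decimal places:

```python
def price_to_cents(price_text: str) -> int:
    """Convert text such as '$1,998.00' to an integer number of cents.

    Hypothetical helper; assumes a '$' prefix and exactly two decimal places.
    """
    digits = (
        price_text
        .removeprefix("$")
        .replace(",", "")   # drop the thousands separator
        .replace(".", "")   # dropping the dot is like multiplying by 100
    )
    return int(digits)


# Floats accumulate a tiny representation error...
print(0.1 + 0.2 == 0.3)
# ...while integer cents add up exactly.
print(price_to_cents("$0.10") + price_to_cents("$0.20") == price_to_cents("$0.30"))
print(price_to_cents("$1,998.00"))
```

Integers are exact, so sums and comparisons of cents behave as expected. The trade-off is that the value has to be divided by 100 again whenever it is shown to a person.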
@@ -174,6 +184,8 @@ price_text = (
)
```

<!-- group doccmd[all]: end -->

In this case, removing the dot from the price text is the same as multiplying all the numbers by 100, effectively converting dollars to cents. To convert the text to a number, we'll use `int()` instead of `float()`. This is what the whole program looks like now:

```py
6 changes: 4 additions & 2 deletions sources/platform/proxy/datacenter_proxy.md
@@ -118,7 +118,8 @@ await Actor.exit();

```python
from apify import Actor
-import requests, asyncio
+import asyncio
+import requests

async def main():
async with Actor:
@@ -258,7 +259,8 @@ await Actor.exit();

```python
from apify import Actor
-import requests, asyncio
+import asyncio
+import requests

async def main():
async with Actor: