PyScript app that scrapes text from the HTML of an inputted URL only refreshes the page when submitted


I'm new to programming, so I don't know how close to or far from the mark I am, but either way I'm stuck.

I wanted to try making a webpage that uses a Python script to scrape all the text from the HTML of an inputted URL. The trouble is that I don't know which part went wrong: is it just the query selector in the grab_data function? I also noticed while posting this that the submit button stopped working after I changed something, but I can figure that one out.

Originally it worked as I expected when I put the link in the code manually, before I tried PyScript and was just running it in my editor.

Then, as practice, I wanted to figure out how to put it on a website where you can input a link and get the text back.

When I tried it with PyScript, nothing happened when I submitted the link: no errors in the console, the page just refreshes.

The HTML:



<body>
    
<div class="body">
    <div>Placeholder</div>
    <div class="lyricsApp">
        <form class="lyricsInputForm">
                
                <input type="text" id="urlInput" placeholder="Paste a link!">
                <button  id="lyricsButton" type="submit" py-click="grab_data()">Get Lyrics</button>
            
        </form>
            
        <p class="lyricsOutput">Placeholder</p>
        
    </div>
    <div>Placeholder 2</div>
    
<script type="py" src="./main.py" config="./pyscript.toml" terminal></script>


</body>
</html>

Python:

from urllib.request import Request, urlopen
from bs4 import BeautifulSoup
from pyscript import document



def grab_data():
    
    url_input = document.querySelector("#urlInput")
    url_event = url_input.value
    disguise_request = Request(url_event, headers={"User-Agent": "Mozilla/5.0"})
    data = urlopen(disguise_request).read()
    return data



html_data = grabData()
soup = BeautifulSoup(html_data, "html.parser")
raw_text = soup.get_text()
stripped_text = rawText.stripped_strings


def send_back(stripped_text)

    for filter in stripped_text:
            
        if len(filter) > 60 and len(filter) < 8:
                pass
           else:
                return filter
        
    output_text = filter
        
    outputElement = document.getElementById("#lyricsOutput")
    outputElement.innertext = output_text
        
    
send_back(stripped_text)
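
For reference, the filtering step in send_back can be sketched in isolation. This is only a sketch, assuming the intent is to keep strings between 8 and 60 characters long; the function name keep_lyric_lines and the sample strings are made up for illustration. It differs from the loop above in three ways: the length test uses `or` (no length can be both above 60 and below 8), the loop collects results instead of returning at the first string, and it works on any iterable of strings. In Beautiful Soup that iterable would be soup.stripped_strings, which lives on the soup object, not on the string returned by get_text().

```python
def keep_lyric_lines(strings, min_len=8, max_len=60):
    """Return the strings whose length lies within [min_len, max_len]."""
    kept = []
    for text in strings:
        # Skip strings that are too short or too long. This needs `or`:
        # a length can never be both greater than 60 and less than 8.
        if len(text) < min_len or len(text) > max_len:
            continue
        # Collect instead of `return`, which would stop at the first string.
        kept.append(text)
    return kept


# Hypothetical sample input standing in for soup.stripped_strings.
sample = ["Menu", "A line of roughly thirty chars", "x" * 80]
print("\n".join(keep_lyric_lines(sample)))
```

The joined result is what would then be written to the output element.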

And I know I probably wrote the pyscript.toml file wrong, but I think there are more things not working than just this:

packages = ["Request", "urlopen", "BeautifulSoup" ]

There is 1 answer

Answer by cclauss

You could also prompt() the user for input()...

import contextlib

with contextlib.suppress(ImportError):
    from pyscript import window
    input = window.prompt

link = input('Please provide a link')
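
As for the refresh itself: a type="submit" button inside a <form> triggers the browser's default form submission, which reloads the page unless the event handler cancels it. Below is a minimal sketch of cancelling it, assuming PyScript's `when` decorator and the #urlInput, .lyricsInputForm and .lyricsOutput selectors from the question's HTML. The helper name read_url_and_cancel is made up for illustration, and the ImportError guard mirrors the snippet above so the file also loads outside the browser.

```python
import contextlib


def read_url_and_cancel(event, document):
    """Cancel the form's default submit (the page reload) and return the typed URL."""
    event.preventDefault()
    return document.querySelector("#urlInput").value.strip()


with contextlib.suppress(ImportError):
    # Only available when running under PyScript in the browser.
    from pyscript import document, when

    @when("submit", ".lyricsInputForm")
    def on_submit(event):
        url = read_url_and_cancel(event, document)
        # lyricsOutput is a class in the question's HTML, so select it with
        # ".lyricsOutput" rather than getElementById("#lyricsOutput").
        document.querySelector(".lyricsOutput").innerText = f"Got link: {url}"
```

Two further things are worth checking, though I have not run the question's code: urllib.request.urlopen generally cannot open connections from inside the browser, where something like pyodide.http.pyfetch is the usual substitute and the target site must allow cross-origin requests; and the packages list in pyscript.toml takes package names such as "beautifulsoup4", not the names of imported functions.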