pylucene - ModuleNotFoundError: No module named 'org'

# Common imports:
import sys
from os import path, listdir

from org.apache.lucene.document import Document, Field, StringField, TextField
from org.apache.lucene.util import Version
from org.apache.lucene.store import RAMDirectory
from datetime import datetime

# Indexer imports:
from org.apache.lucene.analysis.miscellaneous import LimitTokenCountAnalyzer
from org.apache.lucene.analysis.standard import StandardAnalyzer
from org.apache.lucene.index import IndexWriter, IndexWriterConfig
# from org.apache.lucene.store import SimpleFSDirectory

# Retriever imports:
from org.apache.lucene.search import IndexSearcher
from org.apache.lucene.index import DirectoryReader
from org.apache.lucene.queryparser.classic import QueryParser

# ---------------------------- global constants ----------------------------- #

BASE_DIR = path.dirname(path.abspath(sys.argv[0]))
INPUT_DIR = BASE_DIR + "/input/"
INDEX_DIR = BASE_DIR + "/lucene_index/"

I'm trying to test the PyLucene library. I wrote this code only to test the imports, and it doesn't work. I get:

bigissue@vmi995554:~/myluceneproj$  cd /home/bigissue/myluceneproj ; /usr/bin/env /usr/bin/python3.10 /home/bigissue/.vscode/extensions/ms-python.python-2022.16.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher 36991 -- /home/bigissue/myluceneproj/hello_lucene.py 
Traceback (most recent call last):
  File "/home/bigissue/myluceneproj/hello_lucene.py", line 29, in <module>
    from org.apache.lucene.document import Document, Field, StringField, TextField
ModuleNotFoundError: No module named 'org'
bigissue@vmi995554:~/myluceneproj$ 

I ran python3.10 -m pip list and the "lucene" module is listed. If I run import lucene, it works, but Python doesn't recognize the org module. Why?
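One way to narrow this down is to check which top-level module names the interpreter can actually locate, before and after importing lucene. The sketch below uses only the standard library. The helper names is_importable and report are hypothetical, and the idea that org might only become visible once lucene has been imported is an assumption about how the JCC-built extension behaves, not something confirmed in this post.

```python
import importlib
import importlib.util
import sys


def is_importable(name):
    """Return True if a top-level module called `name` can be located."""
    # Modules registered at runtime may sit in sys.modules without a spec.
    if name in sys.modules:
        return True
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def report(names=("lucene", "org")):
    """Map each module name to whether this interpreter can find it."""
    return {name: is_importable(name) for name in names}


print("before importing lucene:", report())
try:
    importlib.import_module("lucene")
except ImportError as exc:
    print("lucene could not be imported:", exc)
else:
    # Assumption: if org shows up only here, import lucene must come first.
    print("after importing lucene:", report())
```

Run it with the same interpreter that fails (here /usr/bin/python3.10), since each Python installation has its own set of installed packages.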

UPDATE

I downloaded Lucene 9.1 and set the CLASSPATH environment variable (in /etc/environment):

CLASSPATH=".:/usr/lib/jvm/temurin-17-jdk-amd64/lib:/home/bigissue/all_lucene/lucene-9.4.1/modules:/home/bigissue/all_lucene/lucene-9.1.0/modules" export CLASSPATH

I downloaded pylucene-9.1.0 and installed JCC first:

bigissue@vmi995554:~/all_lucene/pylucene-9.1.0$ pwd
/home/bigissue/all_lucene/pylucene-9.1.0/jcc
bigissue@vmi995554:~/all_lucene/pylucene-9.1.0$ python3.10 setup.py build
bigissue@vmi995554:~/all_lucene/pylucene-9.1.0$ python3.10 setup.py install

I also downloaded Apache Ant.

Then, for PyLucene 9.1 itself, I ran cd .. and edited the Makefile (vim /home/bigissue/all_lucene/pylucene-9.1.0/Makefile):

PREFIX_PYTHON=/usr/bin
ANT=/home/bigissue/all_lucene/apache-ant-1.10.12
PYTHON=$(PREFIX_PYTHON)/python3.10
JCC=$(PYTHON) -m jcc --shared
NUM_FILES=10
bigissue@vmi995554:~/all_lucene/pylucene-9.1.0: make 
bigissue@vmi995554:~/all_lucene/pylucene-9.1.0: make install

If I run python3.10 -m pip list | grep -i "lucene", I see it:

bigissue@vmi995554:~/all_lucene/pylucene-9.1.0$ python3.10 -m pip list | grep -i "lucene"
lucene                 9.1.0

Now I import lucene like this:

import sys
from os import path, listdir
from lucene import * 

directory = RAMDirectory()

But I get

ImportError: cannot import name 'RAMDirectory' from 'lucene' (/usr/local/lib/python3.10/dist-packages/lucene-9.1.0-py3.10-linux-x86_64.egg/lucene/__init__.py)

There are 2 answers

4
xavvvv

Python doesn't use that kind of import. Just import lucene.

If this doesn't fix your problem, sorry!

1
andrewJames

You can use from lucene import whatever.

See the Features documentation, where it states:

"The PyLucene API exposes all Java Lucene classes in a flat namespace in the PyLucene module."

So, in Java you use import org.apache.lucene.index.IndexReader; but in PyLucene you use from lucene import IndexReader.


Update

Regarding the latest error you mentioned in the comments to your question:

ImportError: cannot import name 'RAMDirectory' from 'lucene'

Lucene's RAMDirectory was deprecated for a long time and was finally removed in version 9.0 of Lucene.

You can use a different directory implementation.

Recommended: MMapDirectory, but there are other options, such as ByteBuffersDirectory.
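As a rough illustration, an in-memory replacement for RAMDirectory might look like the sketch below. The function name open_in_memory_directory is hypothetical. The sketch assumes a PyLucene build in which Java packages become importable as Python modules once lucene has been imported and the JVM has been started; on a build that exposes classes in a flat namespace, the import would instead be from lucene import ByteBuffersDirectory.

```python
try:
    import lucene  # PyLucene, assumed to be built and installed as in the question
except ImportError:
    lucene = None


def open_in_memory_directory():
    """Create a heap-resident Lucene directory in place of RAMDirectory."""
    if lucene is None:
        raise RuntimeError("PyLucene is not installed for this interpreter")
    # The JVM has to be started once per process before any Java class is used.
    if lucene.getVMEnv() is None:
        lucene.initVM()
    # Assumption: this build exposes Java packages as Python modules after
    # `import lucene`; on a flat-namespace build, import from `lucene` instead.
    from org.apache.lucene.store import ByteBuffersDirectory
    return ByteBuffersDirectory()
```

With PyLucene installed, directory = open_in_memory_directory() would take the place of directory = RAMDirectory() from the question.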

(Just to note, a new error/issue should really be addressed by asking a new question.)