Get character encoding information in Scala

import scala.io.Source

def checkCodec(filename: String): String = {
  // No codec is passed here, so Source.fromFile uses the implicit default codec.
  val bufferedSource = Source.fromFile(filename)
  // Read the codec name, then release the file handle.
  val codec: String = bufferedSource.codec.toString
  println("bufferedSource.codec - " + codec)
  bufferedSource.close()
  if (codec.equalsIgnoreCase("UTF-8")) filename + " " + codec
  else "CodecErrorDetected"
}

val validFile = checkCodec(fileName)

println("The file is - " + validFile)

This function runs fine and returns "UTF-8" even when the file is a .zip, has an incorrect file format, or is corrupted (I used https://pinetools.com/corrupt-file-generator to corrupt it). How can I detect at least the corrupted files? For example, I renamed a PDF file to a nonexistent .pddssee extension, and it was still reported as a UTF-8 file. I need help understanding how to detect a corrupted file using Scala. Is this the correct way to check for a corrupt file?

I would appreciate your input.
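
One likely reason for the result described above: bufferedSource.codec only reports the codec that Source.fromFile was given (here the implicit default), and it never inspects the file's bytes. A minimal sketch of a content-based check follows, assuming the goal is to confirm that the bytes are well-formed UTF-8. The names isValidUtf8 and checkUtf8File are hypothetical helpers, not part of scala.io, and the whole file is read into memory, which may not suit very large files.

```scala
import java.nio.ByteBuffer
import java.nio.charset.{CharacterCodingException, CodingErrorAction, StandardCharsets}
import java.nio.file.{Files, Paths}

// Hypothetical helper: true only if every byte sequence is well-formed UTF-8.
def isValidUtf8(bytes: Array[Byte]): Boolean = {
  // REPORT makes the decoder throw instead of silently substituting U+FFFD.
  val decoder = StandardCharsets.UTF_8.newDecoder()
    .onMalformedInput(CodingErrorAction.REPORT)
    .onUnmappableCharacter(CodingErrorAction.REPORT)
  try {
    decoder.decode(ByteBuffer.wrap(bytes))
    true
  } catch {
    case _: CharacterCodingException => false
  }
}

// Hypothetical wrapper: reads the whole file and runs the strict check on it.
def checkUtf8File(filename: String): Boolean =
  isValidUtf8(Files.readAllBytes(Paths.get(filename)))
```

Note that this only answers "is this valid UTF-8 text?". A plain ASCII file passes, renaming a file does not change the outcome because the extension is never consulted, and a binary file such as a .zip or .pdf will usually fail. It does not show that a file of some other format is intact; that would need a format-specific check.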
