How to get the complete URL from an HTTPS request when the "connect" event is emitted?


The following code is a simplified part of this answer about a proxy, just in case:

var http = require('http');
var server = http.createServer(function (req, res) {}).listen(8080); // plain HTTP requests
server.addListener('connect', function (req, socket, bodyhead) {}); // CONNECT (HTTPS tunnel) requests

The listener passed to createServer is executed when the request is HTTP, and the listener for the connect event is executed when the request is HTTPS.

The thing is, I cannot get the complete URL from req in function (req, socket, bodyhead) {}. req.url only shows the domain and the port number, e.g. www.google-analytics.com:443. I'd like it to be something like https://www.google-analytics.com/a/b?f=...

EDIT: I tried new URL(req.url, `http://${req.url}`) to no avail. For a request like https://www.example.com/favicon.ico, it returns this object:

Symbol(context): URLContext {flags: 258, scheme: 'www.wordreference.com:', username: '', password: '', host: null, …}
Symbol(query): URLSearchParams {Symbol(query): Array(0), Symbol(context): URL}
hash: ""
host: ""
hostname: ""
href: "www.example.com:443"
origin: "null"
password: ""
pathname: "443"
port: ""
protocol: "www.example.com:"
search: ""
searchParams: URLSearchParams
username: ""
Symbol(cannot-be-base): (...)
Symbol(cannot-have-username-password-port): (...)
Symbol(special): (...)

It only shows the domain and the port.
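
The odd fields in that object are consistent with how the URL parser reads a bare host:port string: everything before the first colon is taken as the scheme and the rest as an opaque path, so the base argument is ignored. A minimal sketch of a workaround, assuming the CONNECT target always has the form host:port; the helper name parseConnectTarget is hypothetical and not part of Node's API. It only recovers the host and port, since the path is never present in req.url for a CONNECT request.

```javascript
// Hypothetical helper: split a CONNECT target such as "www.example.com:443"
// into hostname and numeric port.
function parseConnectTarget(authority) {
  // Prefixing a scheme turns the string into a valid absolute URL. Without it,
  // "www.example.com:443" parses as scheme "www.example.com:" and path "443".
  const url = new URL(`https://${authority}`);
  return {
    hostname: url.hostname,
    // The URL class drops a scheme's default port (443 for https),
    // so an empty port string means 443 here.
    port: url.port === '' ? 443 : Number(url.port),
  };
}

console.log(parseConnectTarget('www.google-analytics.com:443'));
```

Inside the connect listener this would be called as parseConnectTarget(req.url).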


There is 1 answer

daego On

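It looks like a proxy cannot get the full URL, only the domain part. Because the request is HTTPS, the path and query string are encrypted, so the proxy never sees them. The client sends the proxy only a CONNECT request naming the host and port, and everything after that travels inside the TLS tunnel. This excerpt about inspecting HTTPS traffic makes the same point: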

HTTPS traffic often reveals a domain name. For example, when viewing https://www.wireshark.org in a web browser, a pcap would show www.wireshark.org as the server name for this traffic when viewed in a customized Wireshark column display. Unfortunately, we don’t know other details like the actual URL or data returned from the server.
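
To illustrate why the path stays hidden, here is a rough sketch of a tunneling proxy, assuming a Node.js proxy like the one in the question. The function name createTunnelingProxy is hypothetical. The connect listener opens a TCP connection to the requested host and then copies bytes in both directions; those bytes are TLS records the proxy cannot read, so the host and port in req.url are all it ever learns.

```javascript
const http = require('http');
const net = require('net');

// Hypothetical helper: builds (but does not start) a proxy that blindly
// tunnels CONNECT requests.
function createTunnelingProxy() {
  const proxy = http.createServer((req, res) => {
    // Plain HTTP proxy requests land here; forwarding them is out of scope.
    res.writeHead(501);
    res.end('plain HTTP forwarding is omitted from this sketch');
  });

  proxy.on('connect', (req, clientSocket, head) => {
    // req.url is only the authority, e.g. "www.example.com:443".
    const target = new URL(`https://${req.url}`);
    // IPv6 hostnames come back in brackets, which net.connect does not accept.
    const host = target.hostname.replace(/^\[|\]$/g, '');
    // The default https port is dropped by URL, hence the fallback to 443.
    const port = Number(target.port) || 443;

    const upstream = net.connect(port, host, () => {
      // Tell the client the tunnel is ready, then relay raw bytes both ways.
      clientSocket.write('HTTP/1.1 200 Connection Established\r\n\r\n');
      upstream.write(head);
      upstream.pipe(clientSocket);
      clientSocket.pipe(upstream);
    });

    // Tear down the other side if either end fails.
    upstream.on('error', () => clientSocket.destroy());
    clientSocket.on('error', () => upstream.destroy());
  });

  return proxy;
}
```

Calling createTunnelingProxy().listen(8080) would start it on the port used in the question. Seeing the full URL would require the proxy to terminate TLS itself with a certificate the client trusts, which this sketch does not attempt.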