{-# LANGUAGE BangPatterns , GeneralizedNewtypeDeriving , DoAndIfThenElse , OverloadedStrings , RecordWildCards , UnicodeSyntax #-} -- |This is the Resource Monad; monadic actions to define a behavior -- of resource. The 'Resource' Monad is a kind of 'IO' Monad thus it -- implements 'MonadIO' class, and it is a state machine as well. -- -- Request Processing Flow: -- -- 1. A client issues an HTTP request. -- -- 2. If the URI of it matches to any resource, the corresponding -- 'Resource' Monad starts running on a newly spawned thread. -- -- 3. The 'Resource' Monad looks at request headers, find (or not -- find) an entity, receive the request body (if any), send -- response headers, and then send a response body. This process -- will be discussed later. -- -- 4. The 'Resource' Monad and its thread stops running. The client -- may or may not be sending us the next request at this point. -- -- 'Resource' Monad takes the following states. The initial state is -- /Examining Request/ and the final state is /Done/. -- -- [/Examining Request/] In this state, a 'Resource' looks at the -- request header fields and thinks about the corresponding entity -- for it. If there is a suitable entity, the 'Resource' tells the -- system an entity tag and its last modification time -- ('foundEntity'). If it found no entity, it tells the system so -- ('foundNoEntity'). In case it is impossible to decide the -- existence of entity, which is a typical case for POST requests, -- 'Resource' does nothing in this state. -- -- [/Receiving Body/] A 'Resource' asks the system to receive a -- request body from the client. Before actually reading from the -- socket, the system sends \"100 Continue\" to the client if need -- be. When a 'Resource' transits to the next state without -- receiving all or part of a request body, the system automatically -- discards it. -- -- [/Deciding Header/] A 'Resource' makes a decision of response -- status code and header fields. When it transits to the next -- state, the system validates and completes the header fields and -- then sends them to the client. -- -- [/Sending Body/] In this state, a 'Resource' asks the system to -- write some response body to the socket. When it transits to the -- next state without writing any response body, the system -- automatically completes it depending on the status code. (To be -- exact, such completion only occurs when the 'Resource' transits -- to this state without even declaring the \"Content-Type\" header -- field. See 'setContentType'.) -- -- [/Done/] Everything is over. A 'Resource' can do nothing for the -- HTTP interaction anymore. -- -- Note that the state transition is one-way: for instance, it is an -- error to try to read a request body after writing some -- response. This limitation is for efficiency. We don't want to read -- the entire request before starting 'Resource', nor we don't want to -- postpone writing the entire response till the end of 'Resource' -- computation. module Network.HTTP.Lucu.Resource ( -- * Types Resource , ResourceDef(..) , emptyResource , FormData(..) -- * Getting request header -- |These functions can be called regardless of the current state, -- and they don't change the state of 'Resource'. , getConfig , getRemoteAddr , getRemoteAddr' , getRemoteHost , getRemoteCertificate , getRequest , getMethod , getRequestURI , getRequestVersion , getResourcePath , getPathInfo , getQueryForm , getHeader , getAccept , getAcceptEncoding , isEncodingAcceptable , getContentType , getAuthorization -- * Finding an entity -- |These functions can be called only in the /Examining Request/ -- state. They make the 'Resource' transit to the /Receiving Body/ -- state. , foundEntity , foundETag , foundTimeStamp , foundNoEntity , foundNoEntity' -- * Receiving a request body -- |These functions make the 'Resource' transit to the /Receiving -- Body/ state. , getChunk , getChunks , getForm -- * Declaring response status and header fields -- |These functions can be called at any time before transiting to -- the /Sending Body/ state, but they themselves never causes any -- state transitions. , setStatus , redirect , setContentType , setContentEncoding , setWWWAuthenticate -- ** Less frequently used functions , setLocation , setHeader , deleteHeader -- * Sending a response body -- |These functions make the 'Resource' transit to the -- /Sending Body/ state. , putChunk , putChunks , putBuilder ) where import Blaze.ByteString.Builder (Builder) import qualified Blaze.ByteString.Builder as BB import qualified Blaze.ByteString.Builder.Internal as BB import Control.Applicative import Control.Monad import Control.Monad.IO.Class import Control.Monad.Unicode import Data.Ascii (Ascii, CIAscii) import qualified Data.Ascii as A import qualified Data.Attoparsec.Char8 as P import qualified Data.Attoparsec.Lazy as LP import Data.ByteString (ByteString) import qualified Data.ByteString as Strict import qualified Data.ByteString.Lazy as Lazy import Data.List import qualified Data.Map as M import Data.Maybe import Data.Monoid import Data.Monoid.Unicode import Data.Text (Text) import qualified Data.Text as T import qualified Data.Text.Encoding as T import Data.Time import qualified Data.Time.HTTP as HTTP import Network.HTTP.Lucu.Abortion import Network.HTTP.Lucu.Authentication import Network.HTTP.Lucu.Config import Network.HTTP.Lucu.ContentCoding import Network.HTTP.Lucu.ETag import qualified Network.HTTP.Lucu.Headers as H import Network.HTTP.Lucu.HttpVersion import Network.HTTP.Lucu.Interaction import Network.HTTP.Lucu.MultipartForm import Network.HTTP.Lucu.Request import Network.HTTP.Lucu.Resource.Internal import Network.HTTP.Lucu.Response import Network.HTTP.Lucu.MIMEType import Network.HTTP.Lucu.Utils import Network.Socket hiding (accept) import Network.URI hiding (path) import Prelude.Unicode -- |Get the string representation of the address of remote host. If -- you want a 'SockAddr' instead of 'HostName', use 'getRemoteAddr'. getRemoteAddr' ∷ Resource HostName getRemoteAddr' = do sa ← getRemoteAddr (fromJust ∘ fst) <$> (liftIO $ getNameInfo [NI_NUMERICHOST] True False sa) -- |Resolve an address to the remote host. getRemoteHost ∷ Resource (Maybe HostName) getRemoteHost = do sa ← getRemoteAddr fst <$> (liftIO $ getNameInfo [] True False sa) -- |Get the 'Method' value of the request. getMethod ∷ Resource Method getMethod = reqMethod <$> getRequest -- |Get the URI of the request. getRequestURI ∷ Resource URI getRequestURI = reqURI <$> getRequest -- |Get the HTTP version of the request. getRequestVersion ∷ Resource HttpVersion getRequestVersion = reqVersion <$> getRequest -- |This is an analogy of CGI PATH_INFO. 'getPathInfo' always returns -- @[]@ if the corresponding -- 'Network.HTTP.Lucu.Resource.Tree.ResourceDef' is not greedy. See -- 'getResourcePath'. -- -- Note that the returned path components are URI-decoded. getPathInfo ∷ Resource [Strict.ByteString] getPathInfo = do rsrcPath ← getResourcePath reqPath ← splitPathInfo <$> getRequestURI return $ drop (length rsrcPath) reqPath -- |Assume the query part of request URI as -- application\/x-www-form-urlencoded, and parse it into pairs of -- @(name, formData)@. This function doesn't read the request -- body. Field names are decoded in UTF-8 for an hardly avoidable -- reason. See 'getForm'. getQueryForm ∷ Resource [(Text, FormData)] getQueryForm = parse' <$> getRequestURI where parse' = map toPairWithFormData ∘ parseWWWFormURLEncoded ∘ fromJust ∘ A.fromChars ∘ drop 1 ∘ uriQuery toPairWithFormData ∷ (ByteString, ByteString) → (Text, FormData) toPairWithFormData (name, value) = let fd = FormData { fdFileName = Nothing , fdContent = Lazy.fromChunks [value] } in (T.decodeUtf8 name, fd) -- |@'getHeader' name@ returns the value of the request header field -- @name@. Comparison of header name is case-insensitive. Note that -- this function is not intended to be used so frequently: there -- should be functions like 'getContentType' for every common headers. getHeader ∷ CIAscii → Resource (Maybe Ascii) getHeader name = H.getHeader name <$> getRequest -- |Return the list of 'MIMEType' enumerated on the value of request -- header \"Accept\", or @[]@ if absent. getAccept ∷ Resource [MIMEType] getAccept = do acceptM ← getHeader "Accept" case acceptM of Nothing → return [] Just accept → case P.parseOnly p (A.toByteString accept) of Right xs → return xs Left _ → abort $ mkAbortion' BadRequest $ "Unparsable Accept: " ⊕ A.toText accept where p = do xs ← mimeTypeListP P.endOfInput return xs -- |Return the list of @(contentCoding, qvalue)@ enumerated on the -- value of request header \"Accept-Encoding\". The list is sorted in -- descending order by qvalue. getAcceptEncoding ∷ Resource [(CIAscii, Maybe Double)] getAcceptEncoding = do accEncM ← getHeader "Accept-Encoding" case accEncM of Nothing -- HTTP/1.0 には Accept-Encoding が無い場合の規定が無い -- ので安全の爲 identity が指定された事にする。HTTP/1.1 -- の場合は何でも受け入れて良い事になってゐるので "*" が -- 指定された事にする。 → do ver ← getRequestVersion case ver of HttpVersion 1 0 → return [("identity", Nothing)] HttpVersion 1 1 → return [("*" , Nothing)] _ → abort $ mkAbortion' InternalServerError "getAcceptEncoding: unknown HTTP version" Just ae → if ae ≡ "" then -- identity のみが許される。 return [("identity", Nothing)] else case P.parseOnly p (A.toByteString ae) of Right xs → return $ map toTuple $ reverse $ sort xs Left _ → abort $ mkAbortion' BadRequest $ "Unparsable Accept-Encoding: " ⊕ A.toText ae where p = do xs ← acceptEncodingListP P.endOfInput return xs toTuple (AcceptEncoding {..}) = (aeEncoding, aeQValue) -- |Return 'True' iff a given content-coding is acceptable by the -- client. isEncodingAcceptable ∷ CIAscii → Resource Bool isEncodingAcceptable encoding = any doesMatch <$> getAcceptEncoding where doesMatch ∷ (CIAscii, Maybe Double) → Bool doesMatch (e, q) = (e ≡ "*" ∨ e ≡ encoding) ∧ q ≢ Just 0 -- |Return the value of request header \"Content-Type\" as 'MIMEType'. getContentType ∷ Resource (Maybe MIMEType) getContentType = do cTypeM ← getHeader "Content-Type" case cTypeM of Nothing → return Nothing Just cType → case P.parseOnly p (A.toByteString cType) of Right t → return $ Just t Left _ → abort $ mkAbortion' BadRequest $ "Unparsable Content-Type: " ⊕ A.toText cType where p = do t ← mimeTypeP P.endOfInput return t -- |Return the value of request header \"Authorization\" as -- 'AuthCredential'. getAuthorization ∷ Resource (Maybe AuthCredential) getAuthorization = do authM ← getHeader "Authorization" case authM of Nothing → return Nothing Just auth → case P.parseOnly p (A.toByteString auth) of Right ac → return $ Just ac Left _ → return Nothing where p = do ac ← authCredentialP P.endOfInput return ac -- |Tell the system that the 'Resource' found an entity for the -- request URI. If this is a GET or HEAD request, a found entity means -- a datum to be replied. If this is a PUT or DELETE request, it means -- a datum which was stored for the URI until now. For POST requests -- it raises an error. -- -- 'foundEntity' performs \"If-Match\" test or \"If-None-Match\" test -- whenever possible, and if those tests fail, it immediately aborts -- with status \"412 Precondition Failed\" or \"304 Not Modified\" -- depending on the situation. -- -- If the request method is either GET or HEAD, 'foundEntity' -- automatically puts \"ETag\" and \"Last-Modified\" headers into the -- response. foundEntity ∷ ETag → UTCTime → Resource () foundEntity tag timeStamp = do driftTo ExaminingRequest method ← getMethod when (method ≡ GET ∨ method ≡ HEAD) $ setHeader "Last-Modified" (HTTP.toAscii timeStamp) when (method ≡ POST) $ abort $ mkAbortion' InternalServerError "foundEntity: this is a POST request." foundETag tag driftTo ReceivingBody -- |Tell the system that the 'Resource' found an entity for the -- request URI. The only difference from 'foundEntity' is that -- 'foundETag' doesn't (nor can't) put \"Last-Modified\" header into -- the response. -- -- Using this function is discouraged. You should use 'foundEntity' -- whenever possible. foundETag ∷ ETag → Resource () foundETag tag = do driftTo ExaminingRequest method ← getMethod when (method ≡ GET ∨ method ≡ HEAD) $ setHeader "ETag" $ A.fromAsciiBuilder $ printETag tag when (method ≡ POST) $ abort $ mkAbortion' InternalServerError "Illegal computation of foundETag for POST request." -- If-Match があればそれを見る。 ifMatch ← getHeader "If-Match" case ifMatch of Nothing → return () Just value → if value ≡ "*" then return () else case P.parseOnly p (A.toByteString value) of Right tags -- tags の中に一致するものが無ければ -- PreconditionFailed で終了。 → when ((¬) (any (≡ tag) tags)) $ abort $ mkAbortion' PreconditionFailed $ "The entity tag doesn't match: " ⊕ A.toText value Left _ → abort $ mkAbortion' BadRequest $ "Unparsable If-Match: " ⊕ A.toText value let statusForNoneMatch = if method ≡ GET ∨ method ≡ HEAD then NotModified else PreconditionFailed -- If-None-Match があればそれを見る。 ifNoneMatch ← getHeader "If-None-Match" case ifNoneMatch of Nothing → return () Just value → if value ≡ "*" then abort $ mkAbortion' statusForNoneMatch $ "The entity tag matches: *" else case P.parseOnly p (A.toByteString value) of Right tags → when (any (≡ tag) tags) $ abort $ mkAbortion' statusForNoneMatch $ "The entity tag matches: " ⊕ A.toText value Left _ → abort $ mkAbortion' BadRequest $ "Unparsable If-None-Match: " ⊕ A.toText value driftTo ReceivingBody where p = do xs ← eTagListP P.endOfInput return xs -- |Tell the system that the 'Resource' found an entity for the -- request URI. The only difference from 'foundEntity' is that -- 'foundTimeStamp' performs \"If-Modified-Since\" test or -- \"If-Unmodified-Since\" test instead of \"If-Match\" test or -- \"If-None-Match\" test. Be aware that any tests based on a last -- modification time are unsafe because it is possible to mess up such -- tests by modifying the entity twice in a second. -- -- Using this function is discouraged. You should use 'foundEntity' -- whenever possible. foundTimeStamp ∷ UTCTime → Resource () foundTimeStamp timeStamp = do driftTo ExaminingRequest method ← getMethod when (method ≡ GET ∨ method ≡ HEAD) $ setHeader "Last-Modified" (HTTP.toAscii timeStamp) when (method ≡ POST) $ abort $ mkAbortion' InternalServerError "Illegal computation of foundTimeStamp for POST request." let statusForIfModSince = if method ≡ GET ∨ method ≡ HEAD then NotModified else PreconditionFailed -- If-Modified-Since があればそれを見る。 ifModSince ← getHeader "If-Modified-Since" case ifModSince of Just str → case HTTP.fromAscii str of Right lastTime → when (timeStamp ≤ lastTime) $ abort $ mkAbortion' statusForIfModSince $ "The entity has not been modified since " ⊕ A.toText str Left _ → return () -- 不正な時刻は無視 Nothing → return () -- If-Unmodified-Since があればそれを見る。 ifUnmodSince ← getHeader "If-Unmodified-Since" case ifUnmodSince of Just str → case HTTP.fromAscii str of Right lastTime → when (timeStamp > lastTime) $ abort $ mkAbortion' PreconditionFailed $ "The entity has not been modified since " ⊕ A.toText str Left _ → return () -- 不正な時刻は無視 Nothing → return () driftTo ReceivingBody -- |@'foundNoEntity' mStr@ tells the system that the 'Resource' found -- no entity for the request URI. @mStr@ is an optional error message -- to be replied to the client. -- -- If the request method is PUT, 'foundNoEntity' performs \"If-Match\" -- test and when that fails it aborts with status \"412 Precondition -- Failed\". If the request method is GET, HEAD, POST or DELETE, -- 'foundNoEntity' always aborts with status \"404 Not Found\". foundNoEntity ∷ Maybe Text → Resource () foundNoEntity msgM = do driftTo ExaminingRequest method ← getMethod when (method ≢ PUT) $ abort $ mkAbortion NotFound [] msgM -- エンティティが存在しないと云ふ事は、"*" も含めたどのやうな -- If-Match: 條件も滿たさない。 ifMatch ← getHeader "If-Match" when (ifMatch ≢ Nothing) $ abort $ mkAbortion PreconditionFailed [] msgM driftTo ReceivingBody -- |'foundNoEntity'' is the same as @'foundNoEntity' 'Nothing'@. foundNoEntity' ∷ Resource () {-# INLINE foundNoEntity' #-} foundNoEntity' = foundNoEntity Nothing -- |@'getChunks' limit@ attemts to read the entire request body up to -- @limit@ bytes, and then make the 'Resource' transit to the -- /Deciding Header/ state. When the actual size of the body is larger -- than @limit@ bytes, 'getChunks' immediately aborts with status -- \"413 Request Entity Too Large\". When the request has no body, it -- returns an empty string. -- -- When the @limit@ is 'Nothing', 'getChunks' uses the default -- limitation value ('cnfMaxEntityLength') instead. -- -- 'getChunks' returns a lazy 'Lazy.ByteString' but it's not really -- lazy: reading from the socket just happens at the computation of -- 'getChunks', not at the evaluation of the 'Lazy.ByteString'. getChunks ∷ Maybe Int → Resource Lazy.ByteString getChunks (Just n) | n < 0 = fail ("getChunks: limit must not be negative: " ⧺ show n) | n ≡ 0 = return (∅) | otherwise = getChunks' n getChunks Nothing = getConfig ≫= getChunks ∘ Just ∘ cnfMaxEntityLength getChunks' ∷ Int → Resource Lazy.ByteString getChunks' limit = go limit (∅) where go ∷ Int → Builder → Resource Lazy.ByteString go 0 _ = do chunk ← getChunk 1 if Strict.null chunk then return (∅) else abort $ mkAbortion' RequestEntityTooLarge $ "Request body must be smaller than " ⊕ T.pack (show limit) ⊕ " bytes." go !n !b = do c ← getChunk $ min n BB.defaultBufferSize if Strict.null c then -- Got EOF return $ BB.toLazyByteString b else do let n' = n - Strict.length c xs' = b ⊕ BB.fromByteString c go n' xs' -- |@'getForm' limit@ attempts to read the request body with -- 'getChunks' and parse it as @application\/x-www-form-urlencoded@ or -- @multipart\/form-data@. If the request header \"Content-Type\" is -- neither of them, 'getForm' aborts with status \"415 Unsupported -- Media Type\". If the request has no \"Content-Type\", it aborts -- with \"400 Bad Request\". -- -- Field names in @multipart\/form-data@ will be precisely decoded in -- accordance with RFC 2231. On the other hand, -- @application\/x-www-form-urlencoded@ says nothing about character -- encodings for field names, so they'll always be decoded in -- UTF-8. (This could be a bad design, but I can't think of any better -- idea.) getForm ∷ Maybe Int → Resource [(Text, FormData)] getForm limit = do cTypeM ← getContentType case cTypeM of Nothing → abort $ mkAbortion' BadRequest "Missing Content-Type" Just (MIMEType "application" "x-www-form-urlencoded" _) → readWWWFormURLEncoded Just (MIMEType "multipart" "form-data" params) → readMultipartFormData params Just cType → abort $ mkAbortion' UnsupportedMediaType $ A.toText $ A.fromAsciiBuilder $ A.toAsciiBuilder "Unsupported media type: " ⊕ printMIMEType cType where readWWWFormURLEncoded = (map toPairWithFormData ∘ parseWWWFormURLEncoded) <$> (bsToAscii =≪ getChunks limit) bsToAscii bs = case A.fromByteString (Strict.concat (Lazy.toChunks bs)) of Just a → return a Nothing → abort $ mkAbortion' BadRequest "Malformed x-www-form-urlencoded" readMultipartFormData params = case M.lookup "boundary" params of Nothing → abort $ mkAbortion' BadRequest "Missing boundary of multipart/form-data" Just boundary → do src ← getChunks limit b ← case A.fromText boundary of Just b → return b Nothing → abort $ mkAbortion' BadRequest $ "Malformed boundary: " ⊕ boundary case LP.parse (p b) src of LP.Done _ formList → return formList LP.Fail _ eCtx e → abort $ mkAbortion' BadRequest $ "Unparsable multipart/form-data: " ⊕ T.pack (intercalate ", " eCtx) ⊕ ": " ⊕ T.pack e where p b = do xs ← multipartFormP b P.endOfInput return xs -- |@'redirect' code uri@ declares the response status as @code@ and -- \"Location\" header field as @uri@. The @code@ must satisfy -- 'isRedirection' or it raises an error. redirect ∷ StatusCode → URI → Resource () redirect code uri = do when (code ≡ NotModified ∨ not (isRedirection code)) $ abort $ mkAbortion' InternalServerError $ A.toText $ A.fromAsciiBuilder $ A.toAsciiBuilder "Attempted to redirect with status " ⊕ printStatusCode code setStatus code setLocation uri -- |@'setContentType' mType@ declares the response header -- \"Content-Type\" as @mType@. Declaring \"Content-Type\" is -- mandatory for sending a response body. setContentType ∷ MIMEType → Resource () setContentType = setHeader "Content-Type" ∘ A.fromAsciiBuilder ∘ printMIMEType -- |@'setLocation' uri@ declares the response header \"Location\" as -- @uri@. You usually don't need to call this function directly. setLocation ∷ URI → Resource () setLocation uri = case A.fromChars uriStr of Just a → setHeader "Location" a Nothing → abort $ mkAbortion' InternalServerError $ "Malformed URI: " ⊕ T.pack uriStr where uriStr = uriToString id uri "" -- |@'setContentEncoding' codings@ declares the response header -- \"Content-Encoding\" as @codings@. setContentEncoding ∷ [CIAscii] → Resource () setContentEncoding codings = do ver ← getRequestVersion tr ← case ver of HttpVersion 1 0 → return (toAB ∘ unnormalizeCoding) HttpVersion 1 1 → return toAB _ → abort $ mkAbortion' InternalServerError "setContentEncoding: Unknown HTTP version" setHeader "Content-Encoding" $ A.fromAsciiBuilder $ mconcat $ intersperse (A.toAsciiBuilder ", ") $ map tr codings where toAB = A.toAsciiBuilder ∘ A.fromCIAscii -- |@'setWWWAuthenticate' challenge@ declares the response header -- \"WWW-Authenticate\" as @challenge@. setWWWAuthenticate ∷ AuthChallenge → Resource () setWWWAuthenticate = setHeader "WWW-Authenticate" ∘ printAuthChallenge -- |Write a chunk in 'Strict.ByteString' to the response body. You -- must first declare the response header \"Content-Type\" before -- applying this function. See 'setContentType'. putChunk ∷ Strict.ByteString → Resource () putChunk = putBuilder ∘ BB.fromByteString -- |Write a chunk in lazy 'Lazy.ByteString' to the response body. It -- can be safely applied to an infinitely long 'Lazy.ByteString'. -- -- Note that you must first declare the response header -- \"Content-Type\" before applying this function. See -- 'setContentType'. putChunks ∷ Lazy.ByteString → Resource () putChunks = putBuilder ∘ BB.fromLazyByteString