|
Okay, I see what you were doing. Notice that x*sech^2(w^T x) is a vector - this is what you're saying is the gradient of f in this other problem. And that is indeed the correct answer.
Now going back to the original problem, let me just write the answer so you can see the differences: (1/2)(sum from i=1 to m) x^(i)(w^T x^(i) - y^(i)) / |w^T x^(i) - y^(i)|. Apologies for misreading your answer. You are using some shortcuts to calculate the whole gradient at once, which is fine, but I wasn't expecting it.
Let me give you a quick calculus review. The output of f(w) is a number (it looks like a loss function). However, f is a multivariable function: it has as many variables as there are elements in w. I'm assuming that w is a vector, not a matrix; it definitely could be a matrix, but let's assume for now that it is a vector. You talk about the derivative of f with respect to w, but think about the function g(z) = z_1+2z_2+3z_3 where z=(z_1,z_2,z_3) is some vector. You can probably guess that the 'derivative' of g is equal to (1,2,3), but strictly speaking, you can only take a derivative with respect to a one-dimensional variable (i.e. something that varies over the real numbers and therefore isn't a vector). Thus we can't calculate dg/dz directly; we can only calculate dg/dz_1, dg/dz_2, and dg/dz_3. So we simply DEFINE dg/dz to be the VECTOR (dg/dz_1, dg/dz_2, dg/dz_3), which you can easily check is equal to (1,2,3). Similarly, the derivative of f(w) is defined to be the vector (df/dw_1, ..., df/dw_D). You can check your work and see that the answer I gave you can be found using either your method or the method of partial derivatives. I'm actually lying a bit to you here, but it's very useful (and I'm sure your teacher expects you) to think of the derivative of g or f this way. It will help to demystify some of your derivations and let you check your work using basic calculus.
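If you want to sanity-check the "gradient is just the vector of partial derivatives" idea numerically, here is a minimal sketch in plain Python. The helper name numerical_gradient, the step size h, and the test point are my own choices for illustration, not anything from your course.

```python
def g(z):
    """g(z) = z_1 + 2*z_2 + 3*z_3, the example function from the post."""
    return z[0] + 2 * z[1] + 3 * z[2]


def numerical_gradient(func, z, h=1e-6):
    """Approximate (dfunc/dz_1, ..., dfunc/dz_D) with central differences.

    Each partial derivative varies only ONE coordinate of z at a time,
    which is exactly the 'one real variable at a time' idea.
    """
    grad = []
    for i in range(len(z)):
        z_plus = list(z)
        z_minus = list(z)
        z_plus[i] += h
        z_minus[i] -= h
        grad.append((func(z_plus) - func(z_minus)) / (2 * h))
    return grad


# Arbitrary point: g is linear, so its gradient is the same everywhere.
point = [0.5, -1.0, 2.0]
grad_g = numerical_gradient(g, point)
print(grad_g)  # should be close to [1, 2, 3], up to floating-point error
```

The same helper works on any scalar-valued function of a list, so you can point it at your own f(w) and compare against the gradient you derived by hand.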
Earlier you mentioned that you were anxious about taking derivatives of matrices, so since you've mentioned it again, let me briefly justify the fact that d/dw (w^T x) = x. Let x be an arbitrary matrix. Then we can think of it as a linear transformation that maps vectors w to xw. Therefore, if we denote that linear transformation by f(w), we have f(w) = xw. Now, a derivative is supposed to be the best linear approximation of a function at each point. But a linear transformation is linear everywhere, so it is ITSELF the best linear approximation to ITSELF at every point. Thus the best linear approximation (i.e. the derivative) of f(w) is x. So d/dw(f(w)) = x. Hopefully it gives you some peace of mind at least, even if it isn't a rigorous argument.
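For the case where x and w are both vectors in R^D (my assumption here; D is just the number of entries), the same fact can be checked with ordinary partial derivatives, one coordinate at a time:

```latex
\[
w^{T}x = \sum_{j=1}^{D} w_j x_j
\quad\Longrightarrow\quad
\frac{\partial}{\partial w_k}\left(w^{T}x\right) = x_k ,
\qquad k = 1,\dots,D,
\]
\[
\frac{d}{dw}\left(w^{T}x\right)
= \left(\frac{\partial (w^{T}x)}{\partial w_1},\dots,
        \frac{\partial (w^{T}x)}{\partial w_D}\right)
= (x_1,\dots,x_D) = x .
\]
```

Only the j = k term of the sum depends on w_k, which is why each partial derivative collapses to a single entry of x.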
|
how are you so nice here when you're so mean to foreign pros in power rankings
just teasing, thanks for all the responses. I'll check this out closely when I am done with dinner.
|
Oh, and you asked whether x is a matrix, given that x^(i) is a vector. You haven't defined x yet, so I have no clue what it is. It's possible that in your book, the notation used is that x^(i) is the i'th column or row of x. If that is the case, then x would be a matrix. If your book does not use that notation though, then x isn't anything: it is undefined, so don't think about it and don't let it appear in your derivations unless you define it first.
edit: I'm an enigma.
|
I solved the last two derivatives on my own... they were horrible and long but they used the same concepts.
In other news, I am being forced to turn my homework in using latex, and it includes graphs and stuff. This does not seem worth it, it is making my homework take SOOOO long to do, and I expect that people outside of academia don't really end up using this so I am not sure why the professor wouldn't let us write it on a damn piece of paper
|
If your professor is okay with it, you can cut down the time it takes to put graphs in the document by using snipping tool (or screen shot) to take a picture of the graph, then use \includegraphics{mypicture.png} to insert the graph crudely into the doc. You can also manually adjust the size of the picture.
You gotta include \usepackage{graphicx} in the preamble though to do that.
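Putting those two pieces together, a minimal document might look like the sketch below. The file name mypicture.png, the width, and the caption text are placeholders, and I'm assuming the image sits in the same folder as the .tex file.

```latex
\documentclass{article}
\usepackage{graphicx} % provides \includegraphics

\begin{document}

\begin{figure}[h]
  \centering
  % mypicture.png is a placeholder: your screenshot, saved next to the .tex file
  \includegraphics[width=0.6\textwidth]{mypicture.png}
  \caption{Placeholder caption for the screenshot.}
\end{figure}

\end{document}
```

The optional width argument is the "manually adjust the size" part; changing 0.6 scales the picture relative to the text width.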
Also, http://detexify.kirelabs.org/classify.html is your best friend. Just draw any symbol and it will tell you how to make it, and if you need a special package to do so.
|
that symbol classifier - whoever made that is a hero
|
Latex is great. Drawing graphs with latex is absurd. Just use whatever and include them as images. Tikz is awesome for making pretty diagrams, but there are much better and easier ways of drawing graphs.
Yes, nobody outside of (computer) science uses it, but you're being trained as a computer scientist, so I don't see the problem with learning it. We had to learn it in 1st year CS, and once you get the hang of it, you'll never want to use Word again ever.
|
Well I don't use Word either (unless I am forced to), I just write stuff on paper and then scan it to pdf. As long as the handwriting is legible I can't imagine latex being superior to that, but maybe I am wrong?
I suppose I would want it for publishing any papers, though...
|
On February 23 2019 10:49 travis wrote: Well I don't use word either(unless I am forced to), I just write stuff on paper and then scan it to pdf. As long as the handwriting is legible I can't imagine latex being superior to that, but maybe I am wrong?
I suppose I would want it for publishing any papers, though...
"As long as the handwriting is legible" is the key point there, though. Remember that your professor is going to have to correct and grade this work for everybody in your class, not just you.
|
LaTeX is one of my hidden shames, since I use it quite a bit but have never actually spent the time to sit down and learn how it really works. In that sense, a lot of it is 'magic' to me since I mostly know the proper rituals and incantations but would have a hard time rebuilding my collection of useful patterns from scratch.
For graphs (if we're talking figures) I've never used LaTeX directly. I usually use gnuplot (ugly) and then include the PNG into the LaTeX document. There's much nicer graphing libraries out there, I've just never spent the time to find them.
For drawing graphs (if we're talking graph-theory) I've mostly used the Tikz library, as mentioned. I found the Tikz automata library very useful for drawing things like DFAs, NFAs, etc. in courses that required it.
In some math electives I took I ended up using LaTeX for all the coursework. The profs seemed to appreciate it, and I found that it actually cut down on the writing time overall.
|
On February 24 2019 03:10 Mr. Wiggles wrote: I usually use gnuplot (ugly) and then include the PNG into the LaTeX document.
Gnuplot can export to vectorised formats...
|
I wanted to verify a true or false section on my homework that I did, it's abstract algebra groups. I just want to make sure I am understanding this stuff correctly. Note that the * denotes that the set does not contain 0
T/F:
1.) Q (under addition) is a subgroup of R (under addition): true
2.) R* (under addition) is a subgroup of R (under addition): false (no identity in R*)
3.) R* (under multiplication) is a subgroup of R (under addition): false (they need to be same operation right?)
4.) {1,-1} (under multiplication) is a subgroup of R* (under multiplication): true
5.) {2^k : k in Z} (under multiplication) is a subgroup of Q* (under multiplication): false (no zero in Q* so no inverse for 2^k... am I understanding that right?)
|
5) seems true to me. look at it again, but the lack of 0 doesn't matter.
|
At first glance I would also consider 3 true.
|
On February 27 2019 02:20 mahrgell wrote: At first glance I would also consider 3 true.
I can't remember the exact definition of a subgroup, so maybe it just isn't even defined properly if the operators are different. But mostly, R* doesn't have 0, so it doesn't have the identity for addition, and thus is not a group under addition, and therefore also not a subgroup.
|
On February 27 2019 02:29 Acrofales wrote: I can't remember the exact definition of a subgroup, so maybe it just isn't even defined properly if the operators are different. But mostly, R* doesn't have 0, so it doesn't have the identity for addition, and thus is not a group under addition, and therefore also not a subgroup.
For multiplication, 1 plays the same role that 0 does for addition.
|
On February 27 2019 02:45 mahrgell wrote: For multiplication, 1 plays the same role that 0 does for addition.
Yes... so?
A group is not a subgroup of another just because it is a group and its elements are a subset. It has to include the identity of the larger group.
And googling it: https://en.wikipedia.org/wiki/Subgroup
Including the identity is not enough either: it definitely needs to be a group under the same operator, so you can't simply redefine the group operator in the subgroup. So (3) fails on both those points.
|
On February 27 2019 02:52 Acrofales wrote: Yes... so? A group is not a subgroup of another just because it is a group and its elements are a subset. It has to include the identity of the larger group. And googling it: https://en.wikipedia.org/wiki/Subgroup Including the identity is not enough either: it definitely needs to be a group under the same operator, so you can't simply redefine the group operator in the subgroup. So (3) fails on both those points.
At least we learned a more theoretical version, where it is enough to be isomorphic to a subgroup (by your/the wiki definition) of the supposed supergroup to be considered a subgroup from a group-theoretic point of view. How you name your elements and ops really doesn't matter then. And then obviously your isomorphism maps the identity of one group onto the identity of the other group.
Now there is a trivial isomorphism (ln(x)) between (R+\{0}, *) and (R,+).
This is, in fact, what led me to my initial thought. But at least right now, I can't extend this to (R\{0},*), which leads me to believe that (3) is indeed false, just for very different reasons.
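For what it's worth, here is a small numerical sketch of the ln idea in Python. The sample values are arbitrary and this is a spot check, not a proof. The last part shows one standard reason the trick cannot carry over to all of R\{0}: -1 has order 2 under multiplication, while in (R,+) the only x with x + x = 0 is x = 0.

```python
import math

# Sample positive reals; ln is only defined on (0, infinity).
samples = [0.25, 1.0, 2.0, 3.5, 10.0]

# Homomorphism property: ln(a * b) == ln(a) + ln(b), up to rounding.
homomorphism_ok = all(
    math.isclose(math.log(a * b), math.log(a) + math.log(b), abs_tol=1e-12)
    for a in samples
    for b in samples
)

# The multiplicative identity 1 is sent to the additive identity 0.
identity_ok = math.log(1.0) == 0.0

# Obstruction for R\{0}: -1 is not the identity, yet (-1) * (-1) = 1.
# An injective homomorphism into (R,+) would need an x != 0 with x + x = 0,
# and no such real number exists.
order_two_in_R_star = (-1.0 != 1.0) and ((-1.0) * (-1.0) == 1.0)

print(homomorphism_ok, identity_ok, order_two_in_R_star)
```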
|
I think (3) is highlighting the issue about the operator and not the identity (Travis is correct, different operator so not a subgroup). Needing the identity was the purpose of example (2).
(5) is true and I don't know where the confusion came from. The lack of zero in Q* is irrelevant. The inverse for 2^k is 2^(-k) so you are fine when k ranges over Z.
Edit: To the above, being isomorphic to a subgroup is not the same as being a subgroup. There are times when the distinction is not important so we treat isomorphic subgroups as subgroups (mathematicians are often lazy like that). Sometimes however it can cause problems and it is important to remember they are different.
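If it helps, here is a finite spot check of (5) in Python using exact fractions. It only tests a window of exponents, so it illustrates the argument rather than proving it. The names K, window, and is_power_of_two are made up for this sketch.

```python
from fractions import Fraction

# Finite window of the infinite set H = {2^k : k in Z}; K is an arbitrary cutoff.
K = 5
window = [Fraction(2) ** k for k in range(-K, K + 1)]


def is_power_of_two(q):
    """True if the rational q equals 2^k for some integer k (k may be negative)."""
    if q <= 0:
        return False

    def int_pow2(m):
        # A positive integer is a power of two iff it has exactly one set bit.
        return m > 0 and (m & (m - 1)) == 0

    # In lowest terms, 2^k is either (2^k)/1 or 1/(2^|k|).
    n, d = q.numerator, q.denominator
    return (d == 1 and int_pow2(n)) or (n == 1 and int_pow2(d))


# Identity of Q* under multiplication is 1 = 2^0, which is in H.
identity_in_H = is_power_of_two(Fraction(1))

# Closure: 2^a * 2^b = 2^(a+b) stays in H.
closed = all(is_power_of_two(a * b) for a in window for b in window)

# Inverses: 1 / 2^k = 2^(-k) stays in H. Zero is never needed.
inverses = all(is_power_of_two(1 / a) for a in window)

print(identity_in_H, closed, inverses)
```

Fractions are used instead of floats so that products and inverses are exact and membership can be tested without rounding issues.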
|
ah right, the identity in 5.) is 1.. not 0. the confusion was that I was thinking 2^0 = 1 .. temporarily forgetting that the member of my group is 2^k, not k. if that confuses you don't worry about it, my thinking clearly didn't make sense. glad I posted them though!
|
|
|
|